The \((X'X)^{-1}\) matrix for the model \(y=β_0+β_1 x_1+β_2 x_2+β_3 x_3+β_4 x_4+β_5 x_5+β_6 x_6+ε\) is given below.
If MSE = 1.395 and n = 38, compute the following quantities.
\[se(\mathbf{\hat\beta_4})=\sqrt{MSE\times C_{55}}=\sqrt{1.395\times0.069}=0.3102499\]
\[Cov(\mathbf{\hat\beta_2,\hat\beta_4})=MSE\times C_{35}=1.395\times(-0.035)=-0.048825\]
\[se(\mathbf{\hat\beta_2})=\sqrt{MSE\times C_{33}}=\sqrt{1.395\times0.067}=0.3057205\]
\[Cor(\mathbf{\hat\beta_2,\hat\beta_4})=\frac{Cov(\mathbf{\hat\beta_2,\hat\beta_4})}{se(\mathbf{\hat\beta_2})se(\mathbf{\hat\beta_4})}=\frac{-0.048825}{0.3057205\times0.3102499}=-0.5147615\]
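The arithmetic above can be reproduced with a short R sketch. Only the three entries of \((X'X)^{-1}\) quoted in the text are used; the variable names are illustrative and not part of the original analysis.

```r
# Entries of (X'X)^{-1} quoted in the text.
# Indices are 1-based, so C_33 belongs to beta_2 and C_55 to beta_4.
mse  <- 1.395
c_33 <- 0.067   # diagonal entry for beta_2
c_55 <- 0.069   # diagonal entry for beta_4
c_35 <- -0.035  # off-diagonal entry linking beta_2 and beta_4

se_b2  <- sqrt(mse * c_33)
se_b4  <- sqrt(mse * c_55)
cov_24 <- mse * c_35
cor_24 <- cov_24 / (se_b2 * se_b4)

# MSE cancels in the ratio, so the correlation depends only on the C entries
cor_24_direct <- c_35 / sqrt(c_33 * c_55)

print(c(se_b2 = se_b2, se_b4 = se_b4, cov_24 = cov_24, cor_24 = cor_24))
```

Because MSE cancels, the sign and size of the correlation between two estimates can be read directly from \((X'X)^{-1}\).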
\(C_{66}=0.058\) is the smallest diagonal element, so \(\hatβ_5\) has the least variance and is the most precisely estimated coefficient.
According to \((X'X)^{-1}\), the off-diagonal elements \(C_{13},\ C_{17},\ C_{24},\ C_{25},\ C_{67}\) are positive.
Therefore, the positively correlated pairs of parameter estimates are
\[\hatβ_0\ \&\ \hatβ_2,\quad \hatβ_0\ \&\ \hatβ_6,\quad \hatβ_1\ \&\ \hatβ_3,\quad \hatβ_1\ \&\ \hatβ_4,\quad \hatβ_5\ \&\ \hatβ_6\]
Consider the following hypothesis: \(H_0: β_1=2β_3,β_2=β_3,β_5=0\)
\[ \mathbf{T}=\begin{bmatrix} 0 & 1 & 0 & -2 & 0 & 0& 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}_{3\times7},\quad \mathbf{β}=\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \\ \beta_6 \end{bmatrix}_{7\times1},\quad \mathbf{C}=\begin{bmatrix} 0 \\ 0 \\ 0\end{bmatrix}_{3\times1},\quad \operatorname{rank}(\mathbf{T})=3 \]
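As a quick check of the constraint matrix, the following R sketch builds \(\mathbf{T}\) and confirms its rank. The coefficient vector `beta_h0` is a made-up example that satisfies \(H_0\); it is not an estimate from any data.

```r
# Constraint matrix for H0: beta_1 = 2*beta_3, beta_2 = beta_3, beta_5 = 0.
# Columns correspond to beta_0, ..., beta_6.
T_mat <- rbind(
  c(0, 1, 0, -2, 0, 0, 0),
  c(0, 0, 1, -1, 0, 0, 0),
  c(0, 0, 0,  0, 0, 1, 0)
)

# Number of linearly independent restrictions
r <- qr(T_mat)$rank

# Hypothetical coefficient vector that satisfies H0 (illustration only)
beta_h0 <- c(1.0, 3.0, 1.5, 1.5, -0.7, 0, 2.2)

# T %*% beta should equal the zero vector C when H0 holds
residual_h0 <- as.vector(T_mat %*% beta_h0)
print(r)
print(residual_h0)
```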
Under this hypothesis, the model reduces to \(y=β_0+2β_3x_1+β_3x_2+β_3x_3+β_4x_4+0x_5+β_6x_6+ε=β_0+β_3(2x_1+x_2+x_3)+β_4x_4+β_6x_6+ε\)
The full model has 6 regression coefficients for predictors, while the reduced model has 3 (\(β_3\), \(β_4\), \(β_6\)).
\[F_0=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{dfE_{Reduced}-dfE_{Full}}}{\frac{SSE_{Full}}{dfE_{Full}}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{[n-(k+1-r)]-[n-(k+1)]}}{\frac{SSE_{Full}}{n-(k+1)}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{[38-(6+1-3)]-[38-(6+1)]}}{\frac{SSE_{Full}}{38-(6+1)}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{3}}{\frac{SSE_{Full}}{31}}\]
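In this conceptual form, the numerator degrees of freedom are \(\nu_1=3\) and the denominator degrees of freedom are \(\nu_2=31\).
After simplifying to a closed form, \(F_0=\frac{31\,(SSE_{Reduced}-SSE_{Full})}{3\,SSE_{Full}}\), the factor in the numerator is 31 and the factor in the denominator is 3.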
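The extra-sum-of-squares statistic can be sketched in R as follows. The helper `partial_f` is a hypothetical name. `sse_full` assumes the MSE of 1.395 quoted earlier belongs to this same full model. `sse_reduced = 60` is a made-up value used only to exercise the formula.

```r
# Extra-sum-of-squares F statistic for r linear restrictions
# in a model with k predictors fitted to n observations.
partial_f <- function(sse_reduced, sse_full, n, k, r) {
  df_num <- r              # dfE_reduced - dfE_full
  df_den <- n - (k + 1)    # dfE_full
  f0 <- ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
  list(f0 = f0, df_num = df_num, df_den = df_den,
       p_value = pf(f0, df_num, df_den, lower.tail = FALSE))
}

sse_full <- 1.395 * 31   # assumption: MSE = 1.395 on 31 error df
res <- partial_f(sse_reduced = 60, sse_full = sse_full, n = 38, k = 6, r = 3)

# 5% critical value of F with (3, 31) degrees of freedom
f_crit <- qf(0.95, 3, 31)
print(res)
print(f_crit)
```

\(H_0\) is rejected at the 5% level when `res$f0` exceeds `f_crit`.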
\[SSR=\sum_{i=1}^n(\hat y_i-\bar y)^2=\sum_{i=1}^n(\hat y_i^2-2\hat y_i\bar y+\bar y^2)=\sum_{i=1}^n\hat y_i^2-2\bar y\sum_{i=1}^n\hat y_i+\sum_{i=1}^n\bar y^2\]
\[=\sum_{i=1}^n\hat y_i^2-2\bar yn\frac{\sum_{i=1}^n\hat y_i}n+n\bar y^2=\sum_{i=1}^n\hat y_i^2-2\bar yn\bar y+n\bar y^2=\sum_{i=1}^n\hat y_i^2-n\bar y^2\]
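The derivation relies on the mean of the fitted values being equal to \(\bar y\), which holds for least squares fits that include an intercept. A minimal numeric check in R, using the built-in `mtcars` data as a stand-in (it is not the data from this assignment):

```r
# Fit any least squares model with an intercept
fit   <- lm(mpg ~ wt + hp, data = mtcars)
y_hat <- fitted(fit)
y_bar <- mean(mtcars$mpg)
n     <- nrow(mtcars)

# SSR from the definition and from the shortcut formula
ssr_definition <- sum((y_hat - y_bar)^2)
ssr_shortcut   <- sum(y_hat^2) - n * y_bar^2

# The step sum(y_hat)/n = y_bar used in the derivation
mean_gap <- mean(y_hat) - y_bar

print(c(definition = ssr_definition, shortcut = ssr_shortcut, mean_gap = mean_gap))
```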
Based on the scatterplots and correlations, \(x_4\), \(x_1\), \(x_7\), and \(x_2\) have a medium to strong positive linear relationship with the response variable (correlation coefficients above 0.6): \(Cor(y,x_4)=0.866, Cor(y,x_1)=0.781, Cor(y,x_7)=0.668, Cor(y,x_2)=0.666\). \(x_5\) has a medium negative linear relationship with the response variable (\(Cor(y,x_5)=-0.62\)).
\[\hat y=292.561-203.144X_1+ 1055.782X_2-49.24X_3+209.762X_4-10.197X_5-24.558X_6+142.778X_7+511.713X_8-301.872X_9\]
The fitted overall model is statistically significant at the 5% significance level (p-value=\(9.744\times10^{-6}\)).
But most of the coefficients are not significant, so this is not the best-fitting model.
On the residual plot, there is a funnel pattern, which suggests non-constant variance.
On the outlier and leverage plot, there are two outliers.
On the Q-Q plot, most of the points approximately follow a straight line but show some positive skew.
I suggest taking the natural log of the response as a variance-stabilizing transformation.
Other diagnostics for heteroskedasticity, variable selection, and measures of influence should also be considered.
According to the F test, the partial sum of squares explained by rainfall (X8) is 2209825, given that all the other regressors are in the model.
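The partial F statistic for rainfall can be recomputed from the Type II ANOVA table in the appendix. The sketch below only uses numbers printed there; the variable names are illustrative.

```r
# Values read from the Type II ANOVA table of the full model
ss_x8  <- 2209825   # partial sum of squares for X8 (rainfall)
sse    <- 7425127   # residual sum of squares
df_err <- 20        # residual degrees of freedom

# Partial F test for one coefficient: SS_partial / MSE
f_x8 <- (ss_x8 / 1) / (sse / df_err)
p_x8 <- pf(f_x8, 1, df_err, lower.tail = FALSE)

# With one numerator df, the partial F equals the squared t statistic
t_x8 <- 2.440       # t value for X8 from the coefficient table
print(c(F = f_x8, p = p_x8, t_squared = t_x8^2))
```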
According to the variance inflation factors (VIF), the model has a serious multicollinearity problem. The VIFs of X4 (105.754708), X1 (101.859709), X3 (31.446394), and X7 (20.53505) are larger than 10.
The coefficient of 511.713 in the full model suggests that the peak rate of flow increases by 511.713 cubic feet per second when rainfall increases by 1 inch and the other variables are held constant.
\[\ln(\hat y)=3.402256-0.013532X_1-1.023664X_2+0.177966X_3+0.108788X_4\] \[-0.009622X_5-0.389474X_6+4.233475X_7+0.63007X_8-0.462276X_9\]
The overall fitted model is statistically significant at the 5% significance level (p-value=\(7.513\times10^{-11}\)).
But most of the coefficients are not significant, so this is not the best-fitting model.
The model still has a serious multicollinearity problem. The variance-stabilizing transformation does not change the VIF values, because VIF depends only on the predictors and not on the response: X4 (105.754708), X1 (101.859709), X3 (31.446394), X7 (20.53505).
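VIF values are the same for \(y\) and \(\ln(y)\) because the response never enters the calculation. The R sketch below shows two equivalent ways to compute VIF from the predictors alone. It uses the built-in `mtcars` data as a stand-in for the watershed data, and `vif_from_predictors` is a hypothetical helper name.

```r
# VIF_j is the j-th diagonal entry of the inverse correlation matrix
# of the predictors; no response variable is involved.
vif_from_predictors <- function(X) {
  diag(solve(cor(X)))
}

X <- mtcars[, c("disp", "hp", "wt", "qsec")]
vif_direct <- vif_from_predictors(X)

# Equivalent definition: regress each predictor on the others,
# then VIF_j = 1 / (1 - R_j^2)
vif_aux <- sapply(names(X), function(v) {
  aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(aux)$r.squared)
})

print(rbind(vif_direct, vif_aux))
```

Any transformation of the response leaves both calculations untouched, while transforming or dropping a predictor changes them.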
Considering the VIF alone, X4 (105.754708) or X1 (101.859709), which have the largest VIF values, would be the first to remove.
However, according to the correlation coefficients, X4, X1, and X7 are strongly correlated with ln(y) (\(Cor(ln(y),x_4)=0.896, Cor(ln(y),x_1)=0.726, Cor(ln(y),x_7)=0.592\)). The textbook suggests that the general approaches for dealing with multicollinearity include collecting additional data, model respecification (redefining the regressors, variable elimination), and alternative estimation methods (ridge regression, principal-component regression). “Variable elimination is often a highly effective technique. However, it may not provide a satisfactory solution if the regressors dropped from the model have significant explanatory power relative to the response y. That is, eliminating regressors to reduce multicollinearity may damage the predictive power of the model.” (Montgomery et al., 2012, p. 304) By this reasoning, X3 (31.446394), which has the third-largest VIF and a weak relationship with ln(y) (0.476), should be considered for removal.
Judging by their definitions, X4, X1, and X3 are geographic variables. X1 is the area of the watershed, X4 is the longest stream flow in the watershed, and X3 is the average slope of the watershed. For the given 6 watersheds, X1 and X4 are strongly related, so a high correlation (0.921) between these two variables is expected. But X3 is not significantly related to X1 (-0.078) or X4 (0.245). Removing X3 might lose some irreplaceable information.
Actually, I do not agree with removing any predictor at this stage. Removing a predictor can lower the maximum VIF considerably. Before elimination, X4 contributes most of the VIF. After stepwise elimination, the multicollinearity disappears in all the models. We should run more diagnostics and comparisons and gather sufficient evidence before removing any predictor.
| remove | Max.VIF | R-squared | Eliminated model | New Max.VIF | New R-squared |
|---|---|---|---|---|---|
| none | 106(X4) | 0.947 | / ,/ ,X3,X4,/ ,X6,X7,X8,X9 | X7=6.28 | 0.947 |
| X1 | 8.70(X4) | 0.947 | / ,/ ,X3,X4,/ ,X6,X7,X8,X9 | X7=6.28 | 0.947 |
| X2 | 67.4(X4) | 0.947 | / ,/ ,X3,X4,/ ,X6,X7,X8,X9 | X7=6.28 | 0.947 |
| X3 | 8.87(X4) | 0.941 | X1,X2,/ ,X4,X5,/ ,/ ,X8,X9 | X1=8.39 | 0.937 |
| X4 | 8.38(X1) | 0.945 | X1,/ ,X3,/ ,/ ,X6,X7,X8,X9 | X9=5.10 | 0.943 |
| X5 | 58.4(X4) | 0.947 | / ,/ ,X3,X4,/ ,X6,X7,X8,X9 | X7=6.28 | 0.947 |
| X6 | 54.8(X4) | 0.939 | X1,X2,/ ,X4,X5,/ ,/ ,X8,X9 | X1=8.39 | 0.937 |
| X7 | 42.0(X1) | 0.939 | X1,X2,/ ,X4,X5,/ ,/ ,X8,X9 | X1=8.39 | 0.937 |
| X8 | 103(X4) | 0.900 | X1,/ ,X3,/ ,/ ,X6,X7,/ ,/ | X7=3.96 | 0.893 |
| X9 | 100(X4) | 0.910 | X1,/ ,X3,/ ,/ ,X6,X7,X8,/ | X7=3.97 | 0.906 |
Before elimination, removing X8 or X9 hurts the R-squared most, although they have low correlation with ln(y). Removing X1, X2, or X5 affects it least. After elimination, we get 3 kinds of model: ‘346789’ (\(R^2=0.947\)); ‘124589’ (\(R^2=0.937\)); ‘1367(8)(9)’ (\(R^2\) from 0.893 to 0.943). X8 appears in all the eliminated models unless we remove it manually. X2 and X5 appear less often than the others.
Turning back to the correlation coefficients, X5 has a stronger correlation with ln(y) than X2 (\(Cor(\ln(y),x_2)=0.658, Cor(\ln(y),x_5)=-0.723\)). Looking at the context, X1 (area of watershed) is a terrain variable, while X5 (surface absorbency index) and X2 (area impervious to water) are surface-property variables, which have similar effects on initial flow (see the discussion at the end).
Finally, among the four surface-property variables related to initial flow (X2, X5, X6, X7), X2 and X7 have the highest correlation, which means X7 may carry most of the information contained in X2. Therefore, if we have to remove one variable, X2 is the best option. We can also confirm this with the results of the further transformation and interaction regressions.
| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | ln(y) |
|---|---|---|---|---|---|---|---|---|---|---|
| X2 | 0.80125933 | 1.00000000 | -0.07302375 | 0.76056089 | -0.48607440 | 0.06598115 | 0.8324480 | 0.10540203 | 0.13331333 | 0.65824739 |
| X5 | -0.73673561 | -0.48607440 | -0.40289059 | -0.77701188 | 1.00000000 | -0.27043338 | -0.4814787 | -0.04005129 | -0.10502725 | -0.72309770 |
Use Stepwise Forward Regression based on p values (use α=0.15)
\[\ln(\hat y)=2.872+0.168X_3+0.122X_4+3.106X_7\]
Use Stepwise AIC Forward Regression
\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]
Stepwise Backward Regression based on p values (use α=0.05) and Stepwise AIC Backward Regression give the same result.
\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]
The best subsets method gives the same model.
\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]
| Method | By | Keep | Remove |
|---|---|---|---|
| Stepwise Forward | P=0.15 | X3, X4, X7 | X1,X2,X5,X6,X8,X9 |
| Stepwise Forward | AIC | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| Stepwise Backward | P=0.05 | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| Stepwise Backward | AIC | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| Stepwise Both | P | X3, X4, X7 | X1,X2,X5,X6,X8,X9 |
| Stepwise Both | AIC | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| Best Subset | p | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| Best Subset | AIC | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
| all possible | / | X3,X4,X6,X7,X8,X9 | X1,X2,X5 |
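The AIC-based rows of this table can be produced with `step()` from base R's stats package, whose trace format matches the step output in the appendix. The sketch below is a minimal illustration on the built-in `mtcars` data, not the watershed data, so the selected variables have no bearing on the results above. `trace = 0` suppresses the step-by-step printout.

```r
# Full and intercept-only models on stand-in data
full_fit <- lm(log(mpg) ~ disp + hp + wt + qsec + drat, data = mtcars)
null_fit <- lm(log(mpg) ~ 1, data = mtcars)

# Backward elimination by AIC, starting from the full model
back_fit <- step(full_fit, direction = "backward", trace = 0)

# Forward selection by AIC, allowed to grow up to the full model
fwd_fit <- step(null_fit, scope = formula(full_fit),
                direction = "forward", trace = 0)

# Predictors kept by each search
kept_back <- attr(terms(back_fit), "term.labels")
kept_fwd  <- attr(terms(fwd_fit), "term.labels")
print(kept_back)
print(kept_fwd)
```

Forward and backward searches are greedy, so they are not guaranteed to agree with each other or with an all-possible-subsets search.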
Both models solve the multicollinearity problem (VIF < 10) and have small p-values for the F test. Neither shows a serious violation of the assumptions about the errors: there is no significant pattern in the plot of studentized residuals versus predicted values, the partial regression plots do not show nonlinear patterns, and the points approximately follow a straight line on the Q-Q plot. Both also show a high correlation between observed residuals and expected residuals under normality; the two models give 0.9837263 and 0.9856766.
| Model | VIF | F | P-value(F) | MSR | MSE | \(R_{adjusted}^2\) | \(R_{Predict}^2\) | P-value(t) | Residuals Plots |
|---|---|---|---|---|---|---|---|---|---|
| 3-4-7 | <10 | 70.378 | \(1.312\times10^{-12}\) | 21.188 | 0.301 | 0.878 | 0.854 | Max=0.054 | Good enough |
| 3-4-6-7-8-9 | <10 | 68.16 | \(1.717\times10^{-13}\) | 11.265 | 0.165 | 0.933 | 0.908 | Max=0.019 | Good enough |
However, compared with the 3-variable model, the 6-variable model has a higher adjusted R-squared (by about 6%: 0.933 versus 0.878) and a higher prediction R-squared (by about 5%: 0.908 versus 0.854), which means it shows stronger predictive capability. All the coefficients in the 6-predictor model are statistically significant at the 2% significance level (the maximum p-value is 0.019). In the 3-variable model, X7 has a p-value of 0.054, which is not significant at the 5% significance level. If we vary the p-value threshold of forward selection, the same model is selected for \(\alpha\) between 0.6 and 0.17. Further, considering the context, X8 and X9 are precipitation variables. The 3-predictor model implies that the peak flow is unrelated to precipitation, which does not make sense. The earlier test in question (3) found that X8 and X9 are important. Therefore, the best model is the model with 6 predictors.
| Model Summary | | | |
|---|---|---|---|
| R | 0.973 | RMSE (Root Mean Square Error) | 0.407 |
| R-Squared | 0.947 | Coef. Var | 6.385 |
| Adj. R-Squared | 0.933 | MSE (Mean Square Error) | 0.165 |
| Pred R-Squared | 0.908 | MAE (Mean Absolute Error) | 0.273 |
| ANOVA | Sum of Squares | DF | Mean Square | F | p-value |
|---|---|---|---|---|---|
| Regression | 67.591 | 6 | 11.265 | 68.16 | \(1.717\times10^{-13}\) |
| Residual | 3.801 | 23 | 0.165 | | |
| Total | 71.393 | 29 | | | |
| model | Estimated coefficients | Partial SS | Std. Error | t test | p-value | 0.357 % | 99.643 % |
|---|---|---|---|---|---|---|---|
| (Intercept) | 2.69180 | / | 0.445 | 6.046 | \(3.63\times10^{-06}\) | 1.37732232 | 4.00627202 |
| X3 | 0.18384 | 5.37 | 0.032 | 5.698 | \(8.41\times10^{-06}\) | 0.08857700 | 0.27911109 |
| X4 | 0.10905 | 2.98 | 0.026 | 4.244 | 0.000306 | 0.03318876 | 0.18491189 |
| X6 | -0.36752 | 1.05 | 0.146 | -2.526 | 0.018898 | -0.79716475 | 0.06213151 |
| X7 | 4.08497 | 1.87 | 1.213 | 3.367 | 0.002662 | 0.50312634 | 7.66681371 |
| X8 | 0.61161 | 3.52 | 0.133 | 4.614 | 0.000122 | 0.22022202 | 1.00298907 |
| X9 | -0.44764 | 2.83 | 0.108 | -4.135 | 0.000402 | -0.76727751 | -0.12799849 |
With SSR = 67.591 and SSE = 3.801, the adjusted R-squared is 0.9329. About 93.29% of the variation in the response is explained by the best model.
The value of PRESS is 6.538275, so the prediction R-squared is 0.908. The model is expected to explain about 90.8% of the variation when predicting the peak rate of flow (in cfs) of water from six watersheds following storm episodes.
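These summary statistics follow directly from the ANOVA table and the PRESS value. The first part of the sketch below recomputes them from the reported numbers. The second part shows one common way to obtain PRESS from a fitted `lm` object through the hat values. `press_stat` is a hypothetical helper name, and `mtcars` is only a stand-in dataset.

```r
# Reported quantities for the 6-predictor model
sse   <- 3.801
sst   <- 71.393
n     <- 30        # total df = 29
p     <- 6         # number of predictors
press <- 6.538275

r2      <- 1 - sse / sst
r2_adj  <- 1 - (sse / (n - p - 1)) / (sst / (n - 1))
r2_pred <- 1 - press / sst
print(round(c(r2 = r2, r2_adj = r2_adj, r2_pred = r2_pred), 4))

# PRESS from a fitted model: leave-one-out residuals are e_i / (1 - h_ii)
press_stat <- function(fit) {
  sum((residuals(fit) / (1 - hatvalues(fit)))^2)
}

demo_fit <- lm(mpg ~ wt, data = mtcars)   # stand-in model
demo_press <- press_stat(demo_fit)
print(demo_press)
```

PRESS is never smaller than SSE, so the prediction R-squared is never larger than the ordinary R-squared.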
Linear regression is one type of watershed hydrological model. Singh (1972) used linear models with a logarithmic transformation of the variables. We retained the following
\[\ln(Dependent\ Variable)=β_0+β_1\ln P+β_2\ln Q_i+β_3\ln F_p+β_4CC+Interactions+\varepsilon\]
where the dependent variables can either be total storm flow volume (\(Q_t\)) in mm, quick flow volume (\(Q_f\)) in mm or peak flow (\(Q_{pk}\)) in \(m^3 sec^{−1} km^{−2}\). Independent variables were storm rainfall (\(P\)) in \(mm\), initial flow (\(Q_i\)) in \(mm h^{−1}\), rainfall frequency (\(F_p\)), the inverse of rainfall duration, in \(h^{−1}\) and a dummy variable (\(CC\)) representing the treatment effect on basin. \(CC\) was 0 and 1 for the calibration (1967–1992) and treated (1994–1998) periods, respectively. \(β_0\) to \(β_4\) are regression coefficients of the independent variables. All interactions between the independent variables were also tested for significance at \(α=0.10\). (Guillemette et al., 2005)
Inspired by this theory, I divided the variables into 4 groups:
| P | F | Q | Q |
|---|---|---|---|
| Precipitation | Time | Terrain | Surface |
| x8: Rainfall (inches) | x9: Time period during which rainfall exceeded ¼ inch/hour | x1: Area of watershed (\(mi^2\)) | x2: Area impervious to water (\(mi^2\)) |
| | | x3: Average slope of watershed (percent) | x5: Surface absorbency index |
| | | x4: Longest stream flow in watershed (1000s of feet) | x6: Estimated soil storage capacity (inches of water) |
| | | | x7: Infiltration rate of water into soil (inches/hour) |
It is reasonable that the best model should contain at least one variable from each group. In this way, X8 and X9 are indispensable variables. We have also seen that the stepwise forward method can neglect this context, so it is not recommended for this question. When we try to simplify a model, we should evaluate each variable against the others within its group.
| Transform | Interaction | Elimination | Model | Number of predictors | Number of variables | R-squared |
|---|---|---|---|---|---|---|
| ln(all) | no | stepboth p | ln(X4) | 1 | 1 | 0.910 |
| ln(y) | no | Backward AIC | X3,X4,X6,X7,X8,X9 | 6 | 6 | 0.947 |
| ln(all) | no | stepboth AIC | ln(X1),ln(X3),ln(X4),ln(X6),ln(X8),ln(X9) | 6 | 6 | 0.983 |
| Mixed | yes | Backward AIC | X1:X3,X1:X4,ln(X8),ln(X9) | 7 | 5 | 0.984 |
| ln(all) | no | Backward AIC | ln(X1),ln(X3),ln(X5),ln(X6),ln(X8),ln(X9) | 6 | 6 | 0.986 |
| ln(y) | yes | Backward AIC | omitted | 15 | 9 | 0.994 |
| ln(all) | yes | Backward AIC | omitted | 22 | 9 | 0.998 |
The all-log model with interactions is obviously overfitting. Therefore, we should control the number of predictors. Since stepwise elimination gave 6 predictors, a new model with a higher R-squared and fewer than 7 predictors is better than the solution in question (7). The best model should be
\[\ln(\hat y)=0.57120+0.72550\ln(X_1)+0.41866\ln(X_3)+1.25873\ln(X_5)-0.26702\ln(X_6)+1.62253\ln(X_8)-1.37489\ln(X_9)\]
If we try other combinations that mix logs and interactions, the results are interesting. For example, the single-variable model (ln(X4)) has a higher R-squared than the 3-4-7 model. A mixed combination of X1:X3, X1:X4, ln(X8), ln(X9) with 5 variables has a higher R-squared than the 3-4-6-7-8-9 model. Due to time constraints, I did not try all the possible combinations. There might also be a better answer if we use glm.
[1]: Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to linear regression analysis (Vol. 821). John Wiley & Sons.
[2]: Guillemette, F., Plamondon, A. P., Prévost, M., & Lévesque, D. (2005). Rainfall generated stormflow response to clearcutting a boreal forest: peak flow comparison with 50 world-wide basin studies. Journal of Hydrology, 302(1-4), 137-153.
library(tidyverse)
library(GGally)
library(olsrr)
library(car)
table_wf <- read_table2("WaterFlow.txt")
ggpairs(data=table_wf[c(1:10)])
# build the model
model_wf_full <- lm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9, data=table_wf)
summary(model_wf_full)
##
## Call:
## lm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9,
## data = table_wf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1404.21 -318.77 74.73 266.66 1274.30
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 292.56 4428.62 0.066 0.9480
## X1 -203.14 410.27 -0.495 0.6259
## X2 1055.78 9833.70 0.107 0.9156
## X3 -49.24 156.20 -0.315 0.7558
## X4 209.76 162.05 1.294 0.2103
## X5 -10.20 51.09 -0.200 0.8438
## X6 -24.56 303.53 -0.081 0.9363
## X7 142.78 3288.44 0.043 0.9658
## X8 511.71 209.74 2.440 0.0241 *
## X9 -301.87 172.00 -1.755 0.0945 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 609.3 on 20 degrees of freedom
## Multiple R-squared: 0.8214, Adjusted R-squared: 0.741
## F-statistic: 10.22 on 9 and 20 DF, p-value: 9.744e-06
Anova(model_wf_full)
## Anova Table (Type II tests)
##
## Response: y
## Sum Sq Df F value Pr(>F)
## X1 91022 1 0.2452 0.62589
## X2 4279 1 0.0115 0.91557
## X3 36893 1 0.0994 0.75585
## X4 622091 1 1.6756 0.21025
## X5 14790 1 0.0398 0.84381
## X6 2430 1 0.0065 0.93632
## X7 700 1 0.0019 0.96580
## X8 2209825 1 5.9523 0.02414 *
## X9 1143622 1 3.0804 0.09455 .
## Residuals 7425127 20
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Model Fit Assessment
ols_plot_diagnostics(model_wf_full)
# Correlation between observed residuals and expected residuals under normality
ols_test_correlation(model_wf_full)
## [1] 0.9710713
# Residual normality tests: a large p-value gives no evidence against normality
ols_test_normality(model_wf_full)
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.9589 0.2898
## Kolmogorov-Smirnov 0.1423 0.5314
## Cramer-von Mises 2.5333 0.0000
## Anderson-Darling 0.5169 0.1748
## -----------------------------------------------
#Lack of Fit F Test
ols_pure_error_anova(lm(y~X8, data = table_wf))
## Lack of Fit F Test
## ---------------
## Response : y
## Predictor: X8
##
## Analysis of Variance Table
## -------------------------------------------------------------------------
## DF Sum Sq Mean Sq F Value Pr(>F)
## -------------------------------------------------------------------------
## X8 1 4616882.92 4616882.92 5.795558 0.02290414
## Residual 28 36951252.44 1319687.59
## Lack of fit 21 31374881.28 1494041.97 1.875466 0.2003839
## Pure Error 7 5576371.17 796624.45
## -------------------------------------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_full)
# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_full)
# for full model
ols_vif_tol(model_wf_full)
## # A tibble: 9 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.00982 102.
## 2 X2 0.133 7.52
## 3 X3 0.0318 31.4
## 4 X4 0.00946 106.
## 5 X5 0.103 9.68
## 6 X6 0.433 2.31
## 7 X7 0.0487 20.5
## 8 X8 0.182 5.50
## 9 X9 0.174 5.75
# build full log model
table_wf_logy <- table_wf %>% mutate(logy=log(y))
table_wf_logy$y <- NULL
ggpairs(data=table_wf_logy[c(1:10)])
model_wf_full_log <- lm(log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9, data=table_wf)
summary(model_wf_full_log)
##
## Call:
## lm(formula = log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 +
## X9, data = table_wf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.95298 -0.20764 0.01499 0.18100 0.67539
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.402256 3.150312 1.080 0.293006
## X1 -0.013532 0.291845 -0.046 0.963477
## X2 -1.023664 6.995235 -0.146 0.885120
## X3 0.177966 0.111113 1.602 0.124908
## X4 0.108788 0.115272 0.944 0.356560
## X5 -0.009622 0.036341 -0.265 0.793898
## X6 -0.389474 0.215916 -1.804 0.086345 .
## X7 4.233475 2.339245 1.810 0.085387 .
## X8 0.630070 0.149200 4.223 0.000418 ***
## X9 -0.462276 0.122350 -3.778 0.001181 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4334 on 20 degrees of freedom
## Multiple R-squared: 0.9474, Adjusted R-squared: 0.9237
## F-statistic: 40 on 9 and 20 DF, p-value: 7.513e-11
# Type II tests for the log-response model
Anova(model_wf_full_log)
#Model Fit Assessment
# ols_plot_diagnostics(model_wf_full_log)
# Correlation between observed residuals and expected residuals under normality
# ols_test_correlation(model_wf_full_log)
# Residual normality tests: a large p-value gives no evidence against normality
# ols_test_normality(model_wf_full_log)
library(dplyr)
## Start: AIC=-42.32
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X1 1 0.0004 3.7577 -44.322
## - X2 1 0.0040 3.7613 -44.293
## - X5 1 0.0132 3.7705 -44.220
## - X4 1 0.1673 3.9246 -43.018
## <none> 3.7573 -42.325
## - X3 1 0.4819 4.2392 -40.705
## - X6 1 0.6113 4.3686 -39.803
## - X7 1 0.6153 4.3726 -39.775
## - X9 1 2.6819 6.4392 -28.164
## - X8 1 3.3503 7.1076 -25.201
##
## Step: AIC=-44.32
## log(y) ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X2 1 0.0110 3.7686 -46.234
## - X5 1 0.0267 3.7844 -46.110
## <none> 3.7577 -44.322
## - X6 1 1.0447 4.8023 -38.963
## - X7 1 1.5520 5.3097 -35.950
## - X4 1 1.8469 5.6046 -34.328
## - X9 1 2.8341 6.5918 -29.461
## - X8 1 3.4848 7.2425 -26.637
## - X3 1 5.0955 8.8532 -20.613
##
## Step: AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X5 1 0.0327 3.8013 -47.975
## <none> 3.7686 -46.234
## - X6 1 1.0375 4.8061 -40.939
## - X4 1 1.8741 5.6428 -36.125
## - X7 1 1.9036 5.6722 -35.968
## - X9 1 2.8353 6.6040 -31.406
## - X8 1 3.4744 7.2430 -28.635
## - X3 1 5.1264 8.8951 -22.471
##
## Step: AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 3.8013 -47.975
## - X6 1 1.0542 4.8555 -42.632
## - X7 1 1.8739 5.6752 -37.953
## - X9 1 2.8256 6.6270 -33.302
## - X4 1 2.9771 6.7784 -32.624
## - X8 1 3.5182 7.3195 -30.320
## - X3 1 5.3653 9.1666 -23.569
## Start: AIC=-44.32
## log(y) ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X2 1 0.0110 3.7686 -46.234
## - X5 1 0.0267 3.7844 -46.110
## <none> 3.7577 -44.322
## - X6 1 1.0447 4.8023 -38.963
## - X7 1 1.5520 5.3097 -35.950
## - X4 1 1.8469 5.6046 -34.328
## - X9 1 2.8341 6.5918 -29.461
## - X8 1 3.4848 7.2425 -26.637
## - X3 1 5.0955 8.8532 -20.613
##
## Step: AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X5 1 0.0327 3.8013 -47.975
## <none> 3.7686 -46.234
## - X6 1 1.0375 4.8061 -40.939
## - X4 1 1.8741 5.6428 -36.125
## - X7 1 1.9036 5.6722 -35.968
## - X9 1 2.8353 6.6040 -31.406
## - X8 1 3.4744 7.2430 -28.635
## - X3 1 5.1264 8.8951 -22.471
##
## Step: AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 3.8013 -47.975
## - X6 1 1.0542 4.8555 -42.632
## - X7 1 1.8739 5.6752 -37.953
## - X9 1 2.8256 6.6270 -33.302
## - X4 1 2.9771 6.7784 -32.624
## - X8 1 3.5182 7.3195 -30.320
## - X3 1 5.3653 9.1666 -23.569
## Start: AIC=-44.29
## log(y) ~ X1 + X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X1 1 0.0073 3.7686 -46.234
## - X5 1 0.0370 3.7983 -45.999
## <none> 3.7613 -44.293
## - X4 1 0.3141 4.0754 -43.887
## - X6 1 0.7115 4.4729 -41.095
## - X3 1 0.7775 4.5388 -40.656
## - X7 1 1.2667 5.0280 -37.585
## - X9 1 2.7122 6.4735 -30.005
## - X8 1 3.4001 7.1614 -26.975
##
## Step: AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X5 1 0.0327 3.8013 -47.975
## <none> 3.7686 -46.234
## - X6 1 1.0375 4.8061 -40.939
## - X4 1 1.8741 5.6428 -36.125
## - X7 1 1.9036 5.6722 -35.968
## - X9 1 2.8353 6.6040 -31.406
## - X8 1 3.4744 7.2430 -28.635
## - X3 1 5.1264 8.8951 -22.471
##
## Step: AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 3.8013 -47.975
## - X6 1 1.0542 4.8555 -42.632
## - X7 1 1.8739 5.6752 -37.953
## - X9 1 2.8256 6.6270 -33.302
## - X4 1 2.9771 6.7784 -32.624
## - X8 1 3.5182 7.3195 -30.320
## - X3 1 5.3653 9.1666 -23.569
## Start: AIC=-40.7
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X7 1 0.1337 4.3729 -41.773
## - X6 1 0.1858 4.4250 -41.418
## <none> 4.2392 -40.705
## - X2 1 0.2995 4.5388 -40.656
## - X5 1 1.1498 5.3891 -35.505
## - X9 1 3.5480 7.7872 -24.461
## - X8 1 4.0900 8.3292 -22.443
## - X1 1 4.6140 8.8532 -20.613
## - X4 1 13.7481 17.9873 0.654
##
## Step: AIC=-41.77
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X6 1 0.1369 4.5098 -42.849
## <none> 4.3729 -41.773
## - X2 1 0.6577 5.0306 -39.570
## - X5 1 1.0236 5.3965 -37.464
## - X9 1 3.4161 7.7890 -26.455
## - X8 1 3.9564 8.3293 -24.442
## - X1 1 4.7933 9.1662 -21.570
## - X4 1 13.8200 18.1929 -1.005
##
## Step: AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 4.5098 -42.849
## - X2 1 0.6110 5.1208 -41.037
## - X5 1 0.8871 5.3969 -39.461
## - X9 1 3.2799 7.7896 -28.452
## - X8 1 3.8347 8.3444 -26.388
## - X1 1 5.0057 9.5155 -22.448
## - X4 1 15.9600 20.4698 0.533
## Start: AIC=-43.02
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X5 1 0.0457 3.9703 -44.671
## - X2 1 0.1508 4.0754 -43.887
## <none> 3.9246 -43.018
## - X1 1 1.6800 5.6046 -34.328
## - X6 1 2.1914 6.1160 -31.708
## - X9 1 2.5158 6.4404 -30.158
## - X8 1 3.1937 7.1183 -27.156
## - X7 1 4.3217 8.2463 -22.743
## - X3 1 14.0627 17.9873 0.654
##
## Step: AIC=-44.67
## log(y) ~ X1 + X2 + X3 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X2 1 0.1126 4.0829 -45.832
## <none> 3.9703 -44.671
## - X6 1 2.5195 6.4898 -31.929
## - X9 1 2.7581 6.7284 -30.846
## - X1 1 2.7838 6.7541 -30.731
## - X8 1 3.6308 7.6011 -27.187
## - X7 1 4.2769 8.2472 -24.740
## - X3 1 24.3256 28.2959 12.246
##
## Step: AIC=-45.83
## log(y) ~ X1 + X3 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 4.0829 -45.832
## - X6 1 2.4147 6.4976 -33.893
## - X9 1 2.6501 6.7330 -32.825
## - X1 1 2.6955 6.7784 -32.624
## - X8 1 3.5347 7.6176 -29.122
## - X7 1 5.2580 9.3409 -23.004
## - X3 1 25.3225 29.4054 11.399
## Start: AIC=-44.22
## log(y) ~ X1 + X2 + X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X1 1 0.0139 3.7844 -46.110
## - X2 1 0.0279 3.7983 -45.999
## - X4 1 0.1998 3.9703 -44.671
## <none> 3.7705 -44.220
## - X6 1 0.7524 4.5229 -40.762
## - X7 1 1.1605 4.9309 -38.170
## - X3 1 1.6186 5.3891 -35.505
## - X9 1 2.8181 6.5886 -29.476
## - X8 1 3.5442 7.3147 -26.339
##
## Step: AIC=-46.11
## log(y) ~ X2 + X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X2 1 0.0170 3.8013 -47.975
## <none> 3.7844 -46.110
## - X6 1 1.0707 4.8551 -40.635
## - X7 1 1.5504 5.3348 -37.808
## - X9 1 2.8243 6.6087 -31.384
## - X4 1 2.9697 6.7541 -30.731
## - X8 1 3.5305 7.3149 -28.339
## - X3 1 5.3638 9.1482 -21.629
##
## Step: AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 3.8013 -47.975
## - X6 1 1.0542 4.8555 -42.632
## - X7 1 1.8739 5.6752 -37.953
## - X9 1 2.8256 6.6270 -33.302
## - X4 1 2.9771 6.7784 -32.624
## - X8 1 3.5182 7.3195 -30.320
## - X3 1 5.3653 9.1666 -23.569
## Start: AIC=-39.8
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X3 1 0.0564 4.4250 -41.418
## - X2 1 0.1043 4.4729 -41.095
## - X7 1 0.1349 4.5035 -40.891
## - X5 1 0.1543 4.5229 -40.762
## <none> 4.3686 -39.803
## - X1 1 0.4338 4.8023 -38.963
## - X4 1 1.7475 6.1160 -31.708
## - X9 1 2.6668 7.0353 -27.508
## - X8 1 3.1935 7.5620 -25.342
##
## Step: AIC=-41.42
## log(y) ~ X1 + X2 + X4 + X5 + X7 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X7 1 0.0848 4.5098 -42.849
## - X2 1 0.3043 4.7293 -41.423
## <none> 4.4250 -41.418
## - X5 1 0.9642 5.3892 -37.504
## - X9 1 3.3632 7.7882 -26.458
## - X8 1 3.9187 8.3437 -24.391
## - X1 1 4.6412 9.0662 -21.899
## - X4 1 16.0173 20.4423 2.492
##
## Step: AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 4.5098 -42.849
## - X2 1 0.6110 5.1208 -41.037
## - X5 1 0.8871 5.3969 -39.461
## - X9 1 3.2799 7.7896 -28.452
## - X8 1 3.8347 8.3444 -26.388
## - X1 1 5.0057 9.5155 -22.448
## - X4 1 15.9600 20.4698 0.533
## Start: AIC=-39.78
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X3 1 0.0003 4.3729 -41.773
## - X6 1 0.1309 4.5035 -40.891
## <none> 4.3726 -39.775
## - X5 1 0.5584 4.9309 -38.170
## - X2 1 0.6554 5.0280 -37.585
## - X1 1 0.9371 5.3097 -35.950
## - X9 1 3.0659 7.4385 -25.836
## - X8 1 3.6837 8.0563 -23.442
## - X4 1 3.8737 8.2463 -22.743
##
## Step: AIC=-41.77
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X8 + X9
##
## Df Sum of Sq RSS AIC
## - X6 1 0.1369 4.5098 -42.849
## <none> 4.3729 -41.773
## - X2 1 0.6577 5.0306 -39.570
## - X5 1 1.0236 5.3965 -37.464
## - X9 1 3.4161 7.7890 -26.455
## - X8 1 3.9564 8.3293 -24.442
## - X1 1 4.7933 9.1662 -21.570
## - X4 1 13.8200 18.1929 -1.005
##
## Step: AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
##
## Df Sum of Sq RSS AIC
## <none> 4.5098 -42.849
## - X2 1 0.6110 5.1208 -41.037
## - X5 1 0.8871 5.3969 -39.461
## - X9 1 3.2799 7.7896 -28.452
## - X8 1 3.8347 8.3444 -26.388
## - X1 1 5.0057 9.5155 -22.448
## - X4 1 15.9600 20.4698 0.533
## Start: AIC=-25.2
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X9
##
## Df Sum of Sq RSS AIC
## - X9 1 0.00002 7.1076 -27.201
## - X4 1 0.01076 7.1183 -27.156
## - X2 1 0.05385 7.1614 -26.975
## - X1 1 0.13496 7.2425 -26.637
## - X5 1 0.20709 7.3147 -26.340
## - X6 1 0.45446 7.5620 -25.342
## <none> 7.1076 -25.201
## - X7 1 0.94872 8.0563 -23.442
## - X3 1 1.22165 8.3292 -22.443
##
## Step: AIC=-27.2
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7
##
## Df Sum of Sq RSS AIC
## - X4 1 0.01127 7.1189 -29.154
## - X2 1 0.05390 7.1615 -28.974
## - X1 1 0.13699 7.2446 -28.628
## - X5 1 0.20769 7.3153 -28.337
## - X6 1 0.45911 7.5667 -27.323
## <none> 7.1076 -27.201
## - X7 1 0.95446 8.0621 -25.421
## - X3 1 1.25421 8.3618 -24.326
##
## Step: AIC=-29.15
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7
##
## Df Sum of Sq RSS AIC
## - X2 1 0.1409 7.2598 -30.5654
## - X5 1 0.4878 7.6066 -29.1654
## <none> 7.1189 -29.1535
## - X6 1 1.1498 8.2686 -26.6619
## - X1 1 2.6634 9.7823 -21.6187
## - X7 1 3.8818 11.0007 -18.0972
## - X3 1 17.0862 24.2051 5.5609
##
## Step: AIC=-30.57
## log(y) ~ X1 + X3 + X5 + X6 + X7
##
## Df Sum of Sq RSS AIC
## - X5 1 0.3665 7.6263 -31.0878
## <none> 7.2598 -30.5654
## - X6 1 1.1128 8.3726 -28.2870
## - X1 1 2.6550 9.9148 -23.2150
## - X7 1 4.5672 11.8270 -17.9245
## - X3 1 19.3750 26.6348 6.4306
##
## Step: AIC=-31.09
## log(y) ~ X1 + X3 + X6 + X7
##
## Df Sum of Sq RSS AIC
## <none> 7.626 -31.088
## - X6 1 1.5290 9.155 -27.606
## - X1 1 2.7319 10.358 -23.902
## - X7 1 4.8252 12.452 -18.381
## - X3 1 26.0905 33.717 11.504
## Start: AIC=-28.16
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8
##
## Df Sum of Sq RSS AIC
## - X4 1 0.00125 6.4404 -30.158
## - X2 1 0.03429 6.4735 -30.005
## - X5 1 0.14939 6.5886 -29.476
## - X1 1 0.15258 6.5918 -29.461
## <none> 6.4392 -28.164
## - X6 1 0.59615 7.0353 -27.508
## - X8 1 0.66842 7.1076 -27.201
## - X7 1 0.99933 7.4385 -25.836
## - X3 1 1.34802 7.7872 -24.462
##
## Step: AIC=-30.16
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7 + X8
##
## Df Sum of Sq RSS AIC
## - X2 1 0.0670 6.5074 -31.848
## - X5 1 0.2880 6.7284 -30.846
## <none> 6.4404 -30.158
## - X8 1 0.6784 7.1189 -29.153
## - X6 1 1.2983 7.7387 -26.649
## - X1 1 2.0749 8.5153 -23.780
## - X7 1 3.5965 10.0369 -18.848
## - X3 1 16.2255 22.6659 5.590
##
## Step: AIC=-31.85
## log(y) ~ X1 + X3 + X5 + X6 + X7 + X8
##
## Df Sum of Sq RSS AIC
## - X5 1 0.2255 6.7330 -32.825
## <none> 6.5074 -31.848
## - X8 1 0.7523 7.2598 -30.565
## - X6 1 1.2782 7.7856 -28.468
## - X1 1 2.1911 8.6986 -25.141
## - X7 1 4.5538 11.0612 -17.933
## - X3 1 18.9765 25.4839 7.105
##
## Step: AIC=-32.83
## log(y) ~ X1 + X3 + X6 + X7 + X8
##
## Df Sum of Sq RSS AIC
## <none> 6.733 -32.825
## - X8 1 0.8934 7.626 -31.088
## - X6 1 1.6647 8.398 -28.197
## - X1 1 2.4940 9.227 -25.372
## - X7 1 4.7585 11.491 -18.788
## - X3 1 26.5284 33.261 13.096
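The AIC values in the `stepAIC()` traces above are negative, while the AIC rows in the regression tables further down are positive. Both are AIC; they differ only by an additive constant that depends on `n`. The following is a minimal sketch of the two scales. The simulated data frame `d` is hypothetical and only stands in for `table_wf`.

```r
# Hypothetical stand-in data (not the table_wf data used in this report)
set.seed(1)
n <- 30
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- exp(1 + 0.5 * d$x1 - 0.3 * d$x2 + rnorm(n, sd = 0.4))
fit <- lm(log(y) ~ x1 + x2, data = d)

rss <- sum(residuals(fit)^2)
p   <- length(coef(fit))      # regression coefficients, intercept included

# Scale printed by step()/stepAIC(): constants of the likelihood are dropped
aic_step <- n * log(rss / n) + 2 * p
# Scale returned by AIC(): full Gaussian log-likelihood, sigma counted as a parameter
aic_full <- aic_step + n * (log(2 * pi) + 1) + 2

c(by_hand = aic_step, extractAIC = extractAIC(fit)[2])
c(by_hand = aic_full, AIC = AIC(fit))

# Same formula with numbers read from the trace above
# (RSS = 4.3726 for the 8-predictor start model, 9 coefficients, n = 30):
30 * log(4.3726 / 30) + 2 * 9   # about -39.78, the "Start: AIC=-39.78" line
```

Because the offset is the same for every model fitted to the same 30 observations, both scales rank the candidate models identically.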
# Compare vif
ols_vif_tol(model_wf_full_log)
## # A tibble: 9 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.00982 102.
## 2 X2 0.133 7.52
## 3 X3 0.0318 31.4
## 4 X4 0.00946 106.
## 5 X5 0.103 9.68
## 6 X6 0.433 2.31
## 7 X7 0.0487 20.5
## 8 X8 0.182 5.50
## 9 X9 0.174 5.75
ols_vif_tol(model_wf_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X3 0.332 3.01
## 2 X4 0.167 5.97
## 3 X6 0.839 1.19
## 4 X7 0.159 6.28
## 5 X8 0.202 4.94
## 6 X9 0.195 5.12
ols_vif_tol(model_wf_rm1_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X2 0.245 4.08
## 2 X3 0.318 3.14
## 3 X4 0.115 8.70
## 4 X5 0.283 3.54
## 5 X6 0.717 1.39
## 6 X7 0.118 8.46
## 7 X8 0.190 5.27
## 8 X9 0.185 5.41
ols_vif_tol(model_wf_rm1_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X3 0.332 3.01
## 2 X4 0.167 5.97
## 3 X6 0.839 1.19
## 4 X7 0.159 6.28
## 5 X8 0.202 4.94
## 6 X9 0.195 5.12
ols_vif_tol(model_wf_rm2_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0181 55.2
## 2 X3 0.0583 17.2
## 3 X4 0.0148 67.4
## 4 X5 0.163 6.13
## 5 X6 0.543 1.84
## 6 X7 0.114 8.79
## 7 X8 0.183 5.46
## 8 X9 0.175 5.72
ols_vif_tol(model_wf_rm2_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X3 0.332 3.01
## 2 X4 0.167 5.97
## 3 X6 0.839 1.19
## 4 X7 0.159 6.28
## 5 X8 0.202 4.94
## 6 X9 0.195 5.12
ols_vif_tol(model_wf_rm3_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0983 10.2
## 2 X2 0.243 4.11
## 3 X4 0.113 8.87
## 4 X5 0.272 3.68
## 5 X6 0.767 1.30
## 6 X7 0.206 4.85
## 7 X8 0.190 5.26
## 8 X9 0.187 5.36
ols_vif_tol(model_wf_rm3_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.119 8.39
## 2 X2 0.313 3.19
## 3 X4 0.123 8.12
## 4 X5 0.343 2.92
## 5 X8 0.209 4.79
## 6 X9 0.205 4.88
ols_vif_tol(model_wf_rm4_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.119 8.38
## 2 X2 0.209 4.79
## 3 X3 0.379 2.64
## 4 X5 0.187 5.35
## 5 X6 0.836 1.20
## 6 X7 0.165 6.05
## 7 X8 0.187 5.35
## 8 X9 0.183 5.45
ols_vif_tol(model_wf_rm4_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.279 3.58
## 2 X3 0.768 1.30
## 3 X6 0.917 1.09
## 4 X7 0.251 3.99
## 5 X8 0.202 4.94
## 6 X9 0.196 5.10
ols_vif_tol(model_wf_rm5_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0269 37.2
## 2 X2 0.210 4.77
## 3 X3 0.0836 12.0
## 4 X4 0.0171 58.4
## 5 X6 0.485 2.06
## 6 X7 0.0774 12.9
## 7 X8 0.200 5.01
## 8 X9 0.190 5.25
ols_vif_tol(model_wf_rm5_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X3 0.332 3.01
## 2 X4 0.167 5.97
## 3 X6 0.839 1.19
## 4 X7 0.159 6.28
## 5 X8 0.202 4.94
## 6 X9 0.195 5.12
ols_vif_tol(model_wf_rm6_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0162 61.5
## 2 X2 0.167 6.00
## 3 X3 0.0563 17.8
## 4 X4 0.0182 54.8
## 5 X5 0.116 8.64
## 6 X7 0.0832 12.0
## 7 X8 0.182 5.49
## 8 X9 0.174 5.75
ols_vif_tol(model_wf_rm6_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.119 8.39
## 2 X2 0.313 3.19
## 3 X4 0.123 8.12
## 4 X5 0.343 2.92
## 5 X8 0.209 4.79
## 6 X9 0.205 4.88
ols_vif_tol(model_wf_rm7_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0238 42.0
## 2 X2 0.310 3.22
## 3 X3 0.135 7.42
## 4 X4 0.0321 31.1
## 5 X5 0.164 6.09
## 6 X6 0.740 1.35
## 7 X8 0.184 5.45
## 8 X9 0.177 5.66
ols_vif_tol(model_wf_rm7_aic_log)
## # A tibble: 6 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.119 8.39
## 2 X2 0.313 3.19
## 3 X4 0.123 8.12
## 4 X5 0.343 2.92
## 5 X8 0.209 4.79
## 6 X9 0.205 4.88
ols_vif_tol(model_wf_rm8_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0103 97.5
## 2 X2 0.134 7.46
## 3 X3 0.0333 30.0
## 4 X4 0.00973 103.
## 5 X5 0.114 8.81
## 6 X6 0.435 2.30
## 7 X7 0.0492 20.3
## 8 X9 0.879 1.14
ols_vif_tol(model_wf_rm8_aic_log)
## # A tibble: 4 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.281 3.56
## 2 X3 0.773 1.29
## 3 X6 0.949 1.05
## 4 X7 0.252 3.96
ols_vif_tol(model_wf_rm9_log)
## # A tibble: 8 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.0104 95.8
## 2 X2 0.134 7.48
## 3 X3 0.0341 29.3
## 4 X4 0.00998 100.
## 5 X5 0.113 8.83
## 6 X6 0.433 2.31
## 7 X7 0.0495 20.2
## 8 X8 0.919 1.09
ols_vif_tol(model_wf_rm9_aic_log)
## # A tibble: 5 x 3
## Variables Tolerance VIF
## <chr> <dbl> <dbl>
## 1 X1 0.280 3.58
## 2 X3 0.771 1.30
## 3 X6 0.945 1.06
## 4 X7 0.252 3.97
## 5 X8 0.963 1.04
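The Tolerance and VIF columns above can be reproduced from auxiliary regressions: regress each predictor on all the others, take that R², and set Tolerance = 1 − R² and VIF = 1 / Tolerance. The sketch below uses base R only. The simulated predictors and the helper `vif_one()` are hypothetical and are not part of olsrr.

```r
# Hypothetical predictors; x2 is built to be nearly collinear with x1
set.seed(2)
n <- 30
X <- data.frame(x1 = rnorm(n), x3 = rnorm(n))
X$x2 <- X$x1 + rnorm(n, sd = 0.2)

# Tolerance and VIF of predictor `v` from its auxiliary regression on the rest
vif_one <- function(v, X) {
  aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  r2  <- summary(aux)$r.squared
  c(Tolerance = 1 - r2, VIF = 1 / (1 - r2))
}

vif_tab <- t(sapply(names(X), vif_one, X = X))
vif_tab

# Cross-check: the VIFs are the diagonal of the inverse correlation matrix
vif_check <- diag(solve(cor(X)))
vif_check
```

This also shows why dropping one of a collinear pair (for example X1 or X4 above) lowers the VIF of the predictor that stays: its auxiliary R² falls once its near-duplicate is gone.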
library(huxtable)
huxreg(model_wf_rm1_log, model_wf_rm2_log, model_wf_rm3_log, model_wf_rm4_log, model_wf_rm5_log, model_wf_rm6_log, model_wf_rm7_log, model_wf_rm8_log, model_wf_rm9_log, model_wf_full_log)
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| (Intercept) | 3.280 | 3.703 | 7.731 *** | 1.200 | 2.581 *** | 5.523 | 7.487 ** | -0.182 | -0.127 | 3.402 |
| | (1.690) | (2.333) | (1.678) | (2.110) | (0.547) | (3.075) | (2.314) | (4.072) | (3.844) | (3.150) |
| X2 | -1.243 | | 6.526 | -5.001 | -2.144 | 4.655 | 8.550 | -3.730 | -2.980 | -1.024 |
| | (5.025) | | (5.358) | (5.568) | (5.445) | (6.574) | (4.819) | (9.350) | (8.912) | (6.995) |
| X3 | 0.183 *** | 0.167 * | | 0.278 *** | 0.201 ** | 0.046 | 0.002 | 0.277 | 0.287 * | 0.178 |
| | (0.034) | (0.080) | | (0.032) | (0.067) | (0.088) | (0.057) | (0.146) | (0.137) | (0.111) |
| X4 | 0.104 ** | 0.119 | 0.286 *** | | 0.088 | 0.253 ** | 0.284 *** | 0.027 | 0.009 | 0.109 |
| | (0.032) | (0.090) | (0.035) | | (0.084) | (0.087) | (0.066) | (0.153) | (0.143) | (0.115) |
| X5 | -0.008 | -0.013 | -0.055 * | 0.013 | | -0.031 | -0.050 | 0.036 | 0.031 | -0.010 |
| | (0.021) | (0.028) | (0.023) | (0.027) | | (0.036) | (0.030) | (0.047) | (0.044) | (0.036) |
| X6 | -0.396 * | -0.375 | -0.161 | -0.531 ** | -0.408 | | -0.138 | -0.335 | -0.385 | -0.389 |
| | (0.164) | (0.188) | (0.168) | (0.155) | (0.199) | | (0.174) | (0.289) | (0.276) | (0.216) |
| X7 | 4.317 ** | 3.975 * | 0.959 | 6.088 *** | 4.611 * | 1.517 | | 5.230 | 5.352 | 4.233 |
| | (1.466) | (1.495) | (1.178) | (1.266) | (1.814) | (1.884) | | (3.124) | (2.965) | (2.339) |
| X8 | 0.629 *** | 0.632 *** | 0.680 *** | 0.606 *** | 0.618 *** | 0.614 *** | 0.657 *** | | 0.125 | 0.630 *** |
| | (0.142) | (0.145) | (0.151) | (0.147) | (0.139) | (0.157) | (0.156) | | (0.085) | (0.149) |
| X9 | -0.461 *** | -0.464 *** | -0.513 *** | -0.436 ** | -0.453 *** | -0.461 ** | -0.490 *** | 0.001 | | -0.462 ** |
| | (0.116) | (0.119) | (0.122) | (0.119) | (0.114) | (0.129) | (0.128) | (0.073) | | (0.122) |
| X1 | | -0.042 | -0.457 *** | 0.250 ** | 0.048 | -0.345 | -0.418 * | 0.242 | 0.255 | -0.014 |
| | | (0.210) | (0.096) | (0.083) | (0.173) | (0.239) | (0.197) | (0.383) | (0.362) | (0.292) |
| N | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 |
| R2 | 0.947 | 0.947 | 0.941 | 0.945 | 0.947 | 0.939 | 0.939 | 0.900 | 0.910 | 0.947 |
| logLik | -11.407 | -11.422 | -13.216 | -12.059 | -11.458 | -13.667 | -13.681 | -20.968 | -19.486 | -11.406 |
| AIC | 42.815 | 42.843 | 46.432 | 44.118 | 42.916 | 47.333 | 47.361 | 61.935 | 58.972 | 44.811 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. |
huxreg(model_wf_rm1_aic_log, model_wf_rm2_aic_log, model_wf_rm3_aic_log, model_wf_rm4_aic_log, model_wf_rm5_aic_log, model_wf_rm6_aic_log, model_wf_rm7_aic_log, model_wf_rm8_aic_log, model_wf_rm9_aic_log, model_wf_aic_log)
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| (Intercept) | 2.692 *** | 2.692 *** | 6.882 *** | 2.307 *** | 2.692 *** | 6.882 *** | 6.882 *** | 2.587 *** | 2.225 *** | 2.692 *** |
| | (0.445) | (0.445) | (1.432) | (0.410) | (0.445) | (1.432) | (1.432) | (0.494) | (0.515) | (0.445) |
| X3 | 0.184 *** | 0.184 *** | | 0.263 *** | 0.184 *** | | | 0.266 *** | 0.268 *** | 0.184 *** |
| | (0.032) | (0.032) | | (0.022) | (0.032) | | | (0.029) | (0.028) | (0.032) |
| X4 | 0.109 *** | 0.109 *** | 0.294 *** | | 0.109 *** | 0.294 *** | 0.294 *** | | | 0.109 *** |
| | (0.026) | (0.026) | (0.033) | | (0.026) | (0.033) | (0.033) | | | (0.026) |
| X6 | -0.368 * | -0.368 * | | -0.532 ** | -0.368 * | | | -0.416 * | -0.435 * | -0.368 * |
| | (0.146) | (0.146) | | (0.144) | (0.146) | | | (0.186) | (0.179) | (0.146) |
| X7 | 4.085 ** | 4.085 ** | | 5.453 *** | 4.085 ** | | | 5.209 *** | 5.174 *** | 4.085 ** |
| | (1.213) | (1.213) | | (1.002) | (1.213) | | | (1.310) | (1.256) | (1.213) |
| X8 | 0.612 *** | 0.612 *** | 0.629 *** | 0.613 *** | 0.612 *** | 0.629 *** | 0.629 *** | | 0.141 | 0.612 *** |
| | (0.133) | (0.133) | (0.142) | (0.137) | (0.133) | (0.142) | (0.142) | | (0.079) | (0.133) |
| X9 | -0.448 *** | -0.448 *** | -0.471 *** | -0.433 *** | -0.448 *** | -0.471 *** | -0.471 *** | | | -0.448 *** |
| | (0.108) | (0.108) | (0.115) | (0.112) | (0.108) | (0.115) | (0.115) | | | (0.108) |
| X1 | | | -0.432 *** | 0.207 *** | | -0.432 *** | -0.432 *** | 0.208 ** | 0.199 ** | |
| | | | (0.086) | (0.053) | | (0.086) | (0.086) | (0.070) | (0.067) | |
| X2 | | | 8.217 | | | 8.217 | 8.217 | | | |
| | | | (4.655) | | | (4.655) | (4.655) | | | |
| X5 | | | -0.043 * | | | -0.043 * | -0.043 * | | | |
| | | | (0.020) | | | (0.020) | (0.020) | | | |
| N | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 | 30 |
| R2 | 0.947 | 0.947 | 0.937 | 0.943 | 0.947 | 0.937 | 0.937 | 0.893 | 0.906 | 0.947 |
| logLik | -11.581 | -11.581 | -14.144 | -12.652 | -11.581 | -14.144 | -14.144 | -22.024 | -20.155 | -11.581 |
| AIC | 39.161 | 39.161 | 44.288 | 41.305 | 39.161 | 44.288 | 44.288 | 56.049 | 54.311 | 39.161 |
| *** p < 0.001; ** p < 0.01; * p < 0.05. |
Stepwise Forward Regression for full model
# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_full_log, penter = 0.15)
# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_full_log)
Stepwise Forward Regression for X4 eliminated model
# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_rm4_log, penter = 0.15)
# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_rm4_log)
Stepwise Forward Regression for X1 eliminated model
# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_rm1_log, penter = 0.15)
# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_rm1_log)
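For comparison, AIC-based forward selection can also be run with base R's `step()`. The sketch below is an illustration on the built-in `mtcars` data, which is only a stand-in for `table_wf`. It is not the olsrr implementation, and the candidate set in `upper` is an arbitrary choice for the example.

```r
# Start from the intercept-only model and let step() add terms one at a time
null_fit <- lm(mpg ~ 1, data = mtcars)

# Candidate pool: the largest model forward selection is allowed to reach
upper <- ~ wt + hp + qsec + drat

fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = upper),
            direction = "forward",
            trace = 0)          # trace = 0 suppresses the per-step tables

formula(fwd)                    # terms that were added
extractAIC(fwd)[2]              # AIC on the step() scale
```

At each step `step()` adds the candidate that lowers AIC the most, and stops when no addition lowers it further.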
Stepwise Backward Regression for full model
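# Stepwise Backward Regression based on p values (intended a=0.05) #
# Note: ols_step_backward_p() takes its removal threshold as `prem`; `penter` is not one of its arguments, so the default prem = 0.3 is what the output below reports. #
ols_step_backward_p(model_wf_full_log, penter = 0.05)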
## Backward Elimination Method
## ---------------------------
##
## Candidate Terms:
##
## 1 . X1
## 2 . X2
## 3 . X3
## 4 . X4
## 5 . X5
## 6 . X6
## 7 . X7
## 8 . X8
## 9 . X9
##
## We are eliminating variables based on p value...
##
## Variables Removed:
##
## - X1
## - X2
## - X5
##
## No more variables satisfy the condition of p value = 0.3
##
##
## Final Model Output
## ------------------
##
## Model Summary
## -------------------------------------------------------------
## R 0.973 RMSE 0.407
## R-Squared 0.947 Coef. Var 6.385
## Adj. R-Squared 0.933 MSE 0.165
## Pred R-Squared 0.908 MAE 0.273
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 67.591 6 11.265 68.16 0.0000
## Residual 3.801 23 0.165
## Total 71.393 29
## -------------------------------------------------------------------
##
## Parameter Estimates
## ----------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ----------------------------------------------------------------------------------------
## (Intercept) 2.692 0.445 6.046 0.000 1.771 3.613
## X3 0.184 0.032 0.476 5.698 0.000 0.117 0.251
## X4 0.109 0.026 0.499 4.244 0.000 0.056 0.162
## X6 -0.368 0.146 -0.133 -2.526 0.019 -0.669 -0.066
## X7 4.085 1.213 0.406 3.367 0.003 1.575 6.595
## X8 0.612 0.133 0.493 4.614 0.000 0.337 0.886
## X9 -0.448 0.108 -0.450 -4.135 0.000 -0.672 -0.224
## ----------------------------------------------------------------------------------------
##
##
## Elimination Summary
## -----------------------------------------------------------------------
## Variable Adj.
## Step Removed R-Square R-Square C(p) AIC RMSE
## -----------------------------------------------------------------------
## 1 X1 0.9474 0.9273 8.0021 42.8146 0.4230
## 2 X2 0.9472 0.9304 6.0604 40.9019 0.4139
## 3 X5 0.9468 0.9329 4.2345 39.1611 0.4065
## -----------------------------------------------------------------------
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_full_log)
## Backward Elimination Method
## ---------------------------
##
## Candidate Terms:
##
## 1 . X1
## 2 . X2
## 3 . X3
## 4 . X4
## 5 . X5
## 6 . X6
## 7 . X7
## 8 . X8
## 9 . X9
##
##
## Variables Removed:
##
## - X1
## - X2
## - X5
##
## No more variables to be removed.
##
##
## Backward Elimination Summary
## ---------------------------------------------------------------
## Variable AIC RSS Sum Sq R-Sq Adj. R-Sq
## ---------------------------------------------------------------
## Full Model 44.811 3.757 67.635 0.94737 0.92369
## X1 42.815 3.758 67.635 0.94737 0.92731
## X2 40.902 3.769 67.624 0.94721 0.93042
## X5 39.161 3.801 67.591 0.94675 0.93286
## ---------------------------------------------------------------
Stepwise Backward Regression for X4 eliminated model
# Stepwise Backward Regression based on p values (use a=0.05) #
ols_step_backward_p(model_wf_rm4_log, prem = 0.05) # the removal threshold argument is `prem`, not `penter`
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_rm4_log)
Stepwise Backward Regression for X1 eliminated model
# Stepwise Backward Regression based on p values (use a=0.05) #
ols_step_backward_p(model_wf_rm1_log, prem = 0.05) # the removal threshold argument is `prem`, not `penter`
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_rm1_log)
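Backward elimination on p-values is simple enough to write by hand, which makes the role of the removal threshold explicit. The sketch below is a hand-rolled illustration using `drop1()` from base R. The helper `backward_p()` and its argument `prem` are hypothetical names, `mtcars` is a stand-in for `table_wf`, and the result need not match olsrr's output in every case.

```r
# Repeatedly drop the term with the largest partial-F p-value
# until every remaining term has p <= prem.
backward_p <- function(fit, prem = 0.05) {
  repeat {
    if (length(attr(terms(fit), "term.labels")) == 0) break  # nothing left to drop
    tab   <- drop1(fit, test = "F")      # first row is "<none>"
    pvals <- tab[["Pr(>F)"]][-1]
    if (max(pvals) <= prem) break        # all remaining terms pass the threshold
    worst <- rownames(tab)[-1][which.max(pvals)]
    fit   <- update(fit, as.formula(paste(". ~ . -", worst)))
  }
  fit
}

full  <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
final <- backward_p(full, prem = 0.05)
summary(final)$coefficients
```

With a loose threshold such as 0.3, fewer terms are removed than with 0.05, so it is worth checking which threshold a run actually used.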
# For full model #
k <- ols_step_best_subset(model_wf_full_log)
k
| mindex | n | predictors | rsquare | adjr | predrsq | cp | aic | sbic | sbc | msep | fpe | apc | hsp |
| 1 | 1 | X4 | 0.803 | 0.796 | 0.772 | 48.9 | 68.4 | -19.8 | 72.6 | 0.538 | 0.536 | 0.225 | 0.0186 |
| 2 | 2 | X3 X4 | 0.873 | 0.864 | 0.844 | 24.2 | 57.2 | -30.5 | 62.8 | 0.373 | 0.369 | 0.155 | 0.0129 |
| 3 | 3 | X3 X4 X7 | 0.89 | 0.878 | 0.854 | 19.7 | 54.8 | -32.7 | 61.8 | 0.348 | 0.341 | 0.143 | 0.012 |
| 4 | 4 | X1 X4 X8 X9 | 0.921 | 0.908 | 0.886 | 10.1 | 47 | -38.1 | 55.4 | 0.272 | 0.264 | 0.111 | 0.00941 |
| 5 | 5 | X3 X4 X7 X8 X9 | 0.932 | 0.918 | 0.892 | 7.85 | 44.5 | -38.8 | 54.3 | 0.255 | 0.243 | 0.102 | 0.0088 |
| 6 | 6 | X3 X4 X6 X7 X8 X9 | 0.947 | 0.933 | 0.908 | 4.23 | 39.2 | -39.7 | 50.4 | 0.217 | 0.204 | 0.0857 | 0.00751 |
| 7 | 7 | X3 X4 X5 X6 X7 X8 X9 | 0.947 | 0.93 | 0.902 | 6.06 | 40.9 | -36.8 | 53.5 | 0.236 | 0.217 | 0.0912 | 0.00816 |
| 8 | 8 | X2 X3 X4 X5 X6 X7 X8 X9 | 0.947 | 0.927 | 0.896 | 8 | 42.8 | -33.8 | 56.8 | 0.259 | 0.233 | 0.0977 | 0.00895 |
| 9 | 9 | X1 X2 X3 X4 X5 X6 X7 X8 X9 | 0.947 | 0.924 | 0.886 | 10 | 44.8 | -30.8 | 60.2 | 0.286 | 0.25 | 0.105 | 0.00989 |
plot(k)
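Two columns of the best-subset table can be checked by hand from sums of squares already printed in this report: Mallows' Cp = SSE_p / MSE_full − n + 2p, and adjusted R² = 1 − [SSE_p / (n − p)] / [SST / (n − 1)], where p counts the coefficients including the intercept. The sketch below plugs in the rounded values from the outputs above, so it matches the table only to about two decimals.

```r
n <- 30                 # observations (row N of the regression tables)

sse_full <- 3.757       # RSS of the 9-predictor model
p_full   <- 10          # 9 slopes + intercept
sse_sub  <- 3.801       # RSS of the subset X3 X4 X6 X7 X8 X9
p_sub    <- 7           # 6 slopes + intercept
sst      <- 71.393      # total sum of squares of log(y)

mse_full <- sse_full / (n - p_full)

# Mallows' Cp: close to p_sub when the subset model has little bias
cp <- sse_sub / mse_full - n + 2 * p_sub

# Adjusted R-squared of the subset model
adj_r2 <- 1 - (sse_sub / (n - p_sub)) / (sst / (n - 1))

round(c(Cp = cp, adj_R2 = adj_r2), 3)
```

Both values agree with the 6-predictor row (cp 4.23, adjr 0.933). Cp falls below p = 7 only for this subset, which is why it stands out in the table.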
# For X4 eliminated model #
# k <- ols_step_best_subset(model_wf_rm4_log)
# k
# plot(k)
# For X1 eliminated model #
# k <- ols_step_best_subset(model_wf_rm1_log)
# k
# plot(k)
# build model 437896 (predictors X4, X3, X7, X8, X9, X6: the best 6-predictor subset above)
model_wf_437896_log <- lm(log(y) ~ X4 + X3 + X7 + X8 + X9 + X6, data=table_wf)
ols_regress(model_wf_437896_log)
## Model Summary
## -------------------------------------------------------------
## R 0.973 RMSE 0.407
## R-Squared 0.947 Coef. Var 6.385
## Adj. R-Squared 0.933 MSE 0.165
## Pred R-Squared 0.908 MAE 0.273
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 67.591 6 11.265 68.16 0.0000
## Residual 3.801 23 0.165
## Total 71.393 29
## -------------------------------------------------------------------
##
## Parameter Estimates
## ----------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ----------------------------------------------------------------------------------------
## (Intercept) 2.692 0.445 6.046 0.000 1.771 3.613
## X4 0.109 0.026 0.499 4.244 0.000 0.056 0.162
## X3 0.184 0.032 0.476 5.698 0.000 0.117 0.251
## X7 4.085 1.213 0.406 3.367 0.003 1.575 6.595
## X8 0.612 0.133 0.493 4.614 0.000 0.337 0.886
## X9 -0.448 0.108 -0.450 -4.135 0.000 -0.672 -0.224
## X6 -0.368 0.146 -0.133 -2.526 0.019 -0.669 -0.066
## ----------------------------------------------------------------------------------------
confint(model_wf_437896_log, level = 1-(0.05/7)) # Bonferroni joint confidence interval #
## 0.357 % 99.643 %
## (Intercept) 1.37732232 4.00627202
## X4 0.03318876 0.18491189
## X3 0.08857700 0.27911109
## X7 0.50312634 7.66681371
## X8 0.22022202 1.00298907
## X9 -0.76727751 -0.12799849
## X6 -0.79716475 0.06213151
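The `confint()` call above obtains Bonferroni joint intervals by raising the per-interval level to 1 − α/g, with g = 7 coefficients. Written out, each interval is estimate ± t(1 − α/(2g); n − p) × standard error. The sketch below spells this out on the built-in `mtcars` data, which is only a stand-in for `table_wf`, and compares the result with `confint()`.

```r
fit   <- lm(mpg ~ wt + hp, data = mtcars)
alpha <- 0.05
g     <- length(coef(fit))              # number of intervals held jointly

est   <- coef(fit)
se    <- sqrt(diag(vcov(fit)))
tcrit <- qt(1 - alpha / (2 * g), df = df.residual(fit))

manual <- cbind(lower = est - tcrit * se,
                upper = est + tcrit * se)
manual

# Same intervals from confint() with the adjusted level
builtin <- confint(fit, level = 1 - alpha / g)
builtin
```

The joint intervals are wider than the individual 95% intervals. In the output above this is why X6, whose individual interval excludes zero, has a Bonferroni interval that includes it.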
# Collinearity Diagnostics #
ols_vif_tol(model_wf_437896_log)
| Variables | Tolerance | VIF |
| X4 | 0.167 | 5.97 |
| X3 | 0.332 | 3.01 |
| X7 | 0.159 | 6.28 |
| X8 | 0.202 | 4.94 |
| X9 | 0.195 | 5.12 |
| X6 | 0.839 | 1.19 |
#Model Fit Assessment
ols_plot_diagnostics(model_wf_437896_log)
# Correlation Test for Normality
ols_test_correlation(model_wf_437896_log) # Correlation between observed residuals and expected residuals under normality; values close to 1 support normality.
## [1] 0.9837263
# Residual Normality Test
ols_test_normality(model_wf_437896_log) # Tests for violation of the normality assumption. A large p-value (e.g. > 0.05) gives no evidence against normality. #
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.9728 0.6175
## Kolmogorov-Smirnov 0.0997 0.8982
## Cramer-von Mises 4.8429 0.0000
## Anderson-Darling 0.2996 0.5612
## -----------------------------------------------
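The correlation test and the Shapiro-Wilk row above can be approximated in base R. The sketch below correlates the ordered residuals with normal scores and then runs `shapiro.test()`. The plotting positions `(i - 0.375) / (n + 0.25)` are an assumption and may differ from what olsrr uses internally, and `mtcars` is only a stand-in for `table_wf`.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)
n   <- length(res)

# Expected normal scores for the ordered residuals (assumed plotting positions)
expected <- qnorm((seq_len(n) - 0.375) / (n + 0.25))

# Correlation between ordered residuals and their expected values under normality
r_normal <- cor(sort(res), expected)
r_normal

# Shapiro-Wilk test on the same residuals
sw <- shapiro.test(res)
sw$p.value
```

A correlation near 1 together with a large Shapiro-Wilk p-value points the same way as a straight normal Q-Q plot.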
# Variable Contributions
ols_plot_added_variable(model_wf_437896_log)
# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_437896_log)
# build model 437 (predictors X4, X3, X7: the best 3-predictor subset above)
model_wf_437_log <- lm(log(y) ~ X4 + X3 + X7, data=table_wf)
ols_regress(model_wf_437_log)
## Model Summary
## -------------------------------------------------------------
## R 0.944 RMSE 0.549
## R-Squared 0.890 Coef. Var 8.618
## Adj. R-Squared 0.878 MSE 0.301
## Pred R-Squared 0.854 MAE 0.414
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 63.565 3 21.188 70.378 0.0000
## Residual 7.828 26 0.301
## Total 71.393 29
## -------------------------------------------------------------------
##
## Parameter Estimates
## -------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -------------------------------------------------------------------------------------
## (Intercept) 2.872 0.547 5.254 0.000 1.748 3.995
## X4 0.122 0.033 0.559 3.730 0.001 0.055 0.189
## X3 0.168 0.040 0.435 4.165 0.000 0.085 0.251
## X7 3.106 1.537 0.309 2.021 0.054 -0.053 6.266
## -------------------------------------------------------------------------------------
# Collinearity Diagnostics #
ols_vif_tol(model_wf_437_log)
| Variables | Tolerance | VIF |
| X4 | 0.188 | 5.32 |
| X3 | 0.386 | 2.59 |
| X7 | 0.181 | 5.53 |
#Model Fit Assessment
ols_plot_diagnostics(model_wf_437_log)
# Correlation Test for Normality
ols_test_correlation(model_wf_437_log) # Correlation between observed residuals and expected residuals under normality; values close to 1 support normality.
## [1] 0.9856766
# Residual Normality Test
ols_test_normality(model_wf_437_log) # Tests for violation of the normality assumption. A large p-value (e.g. > 0.05) gives no evidence against normality. #
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.9765 0.7267
## Kolmogorov-Smirnov 0.1033 0.8736
## Cramer-von Mises 3.1908 0.0000
## Anderson-Darling 0.3511 0.4469
## -----------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_437_log)
# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_437_log)
# Check PRESS Statistic
ols_press(model_wf_full)
## [1] 15880486
ols_press(model_wf_full_log)
## [1] 8.136733
ols_press(model_wf_437896_log)
## [1] 6.538275
ols_press(model_wf_437_log)
## [1] 10.43262
# ols_press(model_wf_137689_log)
# prediction power
ols_pred_rsq(model_wf_full)
## [1] 0.6179649
ols_pred_rsq(model_wf_full_log)
## [1] 0.8860283
ols_pred_rsq(model_wf_437896_log)
## [1] 0.908418
ols_pred_rsq(model_wf_437_log)
## [1] 0.8538697
# ols_pred_rsq(model_wf_137689_log)
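PRESS is the sum of squared leave-one-out prediction errors, and it can be computed without refitting through PRESS = Σ [e_i / (1 − h_ii)]², where h_ii are the leverages. Predicted R² is then 1 − PRESS / SST. The sketch below shows both on the built-in `mtcars` data, which is only a stand-in for `table_wf`, and confirms the shortcut against an explicit leave-one-out loop.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

e <- residuals(fit)
h <- hatvalues(fit)                       # leverages h_ii

press   <- sum((e / (1 - h))^2)           # shortcut formula
sst     <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
pred_r2 <- 1 - press / sst

# Brute-force check: refit without observation i, then predict it
loo_err <- sapply(seq_len(nrow(mtcars)), function(i) {
  f <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(f, newdata = mtcars[i, ])
})
press_loo <- sum(loo_err^2)

c(press = press, press_loo = press_loo, pred_r2 = pred_r2)

# Same relation with the values printed above for model 437896:
1 - 6.538275 / 71.393                     # about 0.908
```

PRESS is measured in squared units of the response, so the raw-`y` value (15880486) is not directly comparable with the `log(y)` values. Predicted R² is unit-free and is the safer basis for comparing the raw and log models.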
# log model with the X1*X8 interaction (X4 eliminated)
model_wf_18rm4_log <- lm(log(y) ~ X1*X8 + X3 + X6 + X7 + X9, data = table_wf)
# log model with the product X1*X8 as a single predictor (X4 eliminated)
table_wf_resi <- table_wf %>% mutate(x1t8 = X1*X8)
model_wf_1time8_log <- lm(log(y) ~ x1t8 + X3 + X6 + X7 + X9, data = table_wf_resi)
# log model with the product X1*X4 as a single predictor
table_wf_resi <- table_wf %>% mutate(x1t4 = X1*X4)
model_wf_1time4_log <- lm(log(y) ~ x1t4 + X3 + X6 + X7 + X8 + X9, data = table_wf_resi)
summary(model_wf_1time4_log)
# log model with the ratio X1/X4 as a single predictor
table_wf_resi <- table_wf %>% mutate(x14 = X1/X4)
model_wf_1per4_log <- lm(log(y) ~ x14 + X3 + X6 + X7 + X8 + X9, data = table_wf_resi)
# log model with the X4*X3 interaction (X1 eliminated)
model_wf_43rm1_log <- lm(log(y) ~ X9 + X4*X3 + X6 + X7 + X8, data = table_wf)
# log model with the X4*X9 interaction (X1 eliminated)
model_wf_49rm1_log <- lm(log(y) ~ X3 + X4*X9 + X6 + X7 + X8, data = table_wf)
# log model with the X4*X8 interaction (X1 eliminated)
model_wf_48rm1_log <- lm(log(y) ~ X3 + X4*X8 + X6 + X7 + X9, data = table_wf)
# log model with the X4*X7 interaction (X1 eliminated)
model_wf_47rm1_log <- lm(log(y) ~ X3 + X4*X7 + X6 + X9 + X8, data = table_wf)
# log model with the ratio X4/X9 as a single predictor
table_wf_resi <- table_wf %>% mutate(x4p9 = X4/X9)
model_wf_4per9_log <- lm(log(y) ~ X3 + x4p9 + X6 + X7 + X8, data = table_wf_resi)
# log model with the X3*X4 and X8*X9 interactions
model_wf_34v89_log <- lm(log(y) ~ X3*X4 + X8*X9 + X6 + X7, data = table_wf_resi)
# log model with the X3*X4, X8*X9 and X6*X7 interactions
model_wf_34v89v67_log <- lm(log(y) ~ X3*X4 + X8*X9 + X6*X7, data = table_wf_resi)
# log model with the ratio X8/X9 and the X4*X3 interaction
table_wf_resi <- table_wf %>% mutate(x8p9 = X8/X9)
model_wf_8per9v43_log <- lm(log(y) ~ x8p9 + X4*X3 + X6 + X7, data = table_wf_resi)
# log model with the ratios X8/X9 and X6/X7 and the X4*X3 interaction
table_wf_resi <- table_wf %>% mutate(x8p9 = X8/X9, x6p7 = X6/X7)
model_wf_6p7v8p9v43_log <- lm(log(y) ~ x8p9 + X4*X3 + x6p7, data = table_wf_resi)
# log model with the ratio X8/X9 and the X4*X3 interaction (X7 eliminated)
table_wf_resi <- table_wf %>% mutate(x8p9 = X8/X9)
model_wf_8per9v43rm7_log <- lm(log(y) ~ x8p9 + X4*X3 + X6, data = table_wf_resi)
huxreg(model_wf_8per9v43rm7_log, model_wf_8per9v43_log, model_wf_43rm1_log, model_wf_6p7v8p9v43_log, model_wf_34v89_log, model_wf_34v89v67_log)
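The formulas above use three different constructions: `X1*X8` (two main effects plus their interaction), a precomputed product column such as `x1t8` (one term, no main effects), and, in the model that follows, `( ... )^2` (all main effects and all pairwise interactions). `terms()` shows what each expands to without fitting anything. This is a small sketch; the formulas only mirror the variable names used in this report.

```r
# `*` expands to both main effects and the interaction
star_terms <- attr(terms(y ~ X1 * X8 + X3), "term.labels")
star_terms

# A precomputed product column is a single ordinary predictor
prod_terms <- attr(terms(y ~ x1t8 + X3), "term.labels")
prod_terms

# `^2` on nine predictors: 9 main effects + choose(9, 2) pairwise interactions
sq_terms <- attr(terms(y ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2),
                 "term.labels")
length(sq_terms)
9 + choose(9, 2)
```

With 45 terms plus an intercept and only 30 observations, the full pairwise model cannot estimate every coefficient. This is likely why the `stepAIC()` trace below begins with a run of steps at an unchanged AIC: zero-df (aliased) terms are dropped one at a time before any real comparison is made.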
# Interaction regression for full
model_wf_full_log_inter <- lm(log(y)~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_aic_log_inter <- stepAIC(model_wf_full_log_inter)
## Start: AIC=-86.68
## log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 +
## X4:X9 + X5:X6 + X5:X7 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 +
## X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 +
## X4:X9 + X5:X6 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 +
## X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 +
## X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X8 + X4:X9 +
## X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X8 + X4:X9 + X5:X8 +
## X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 +
## X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X6 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 +
## X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 +
## X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 +
## X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X8 +
## X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 +
## X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X8 + X3:X9 +
## X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 +
## X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X6 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 +
## X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X5 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 +
## X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X4 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 +
## X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 +
## X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 +
## X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X8 +
## X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 +
## X6:X9 + X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X8 + X1:X9 + X2:X8 + X2:X9 +
## X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 +
## X7:X8 + X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X5 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 +
## X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 +
## X7:X9 + X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X4 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 +
## X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 +
## X8:X9
##
##
## Step: AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 +
## X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X1:X8 1 0.00125 0.27702 -88.546
## - X3:X9 1 0.00166 0.27743 -88.501
## - X1:X9 1 0.00171 0.27748 -88.496
## - X4:X9 1 0.00224 0.27801 -88.439
## - X5:X8 1 0.00375 0.27952 -88.276
## - X3:X8 1 0.01365 0.28942 -87.232
## - X5:X9 1 0.01394 0.28971 -87.202
## <none> 0.27577 -86.682
## - X4:X8 1 0.01926 0.29503 -86.656
## - X8:X9 1 0.02380 0.29957 -86.198
## - X1:X2 1 0.02492 0.30069 -86.086
## - X2:X8 1 0.02521 0.30098 -86.057
## - X6:X8 1 0.02975 0.30552 -85.608
## - X6:X9 1 0.03024 0.30601 -85.560
## - X2:X9 1 0.03404 0.30981 -85.190
## - X7:X8 1 0.04050 0.31627 -84.570
## - X7:X9 1 0.08581 0.36158 -80.554
## - X1:X3 1 1.65959 1.93536 -30.227
##
## Step: AIC=-88.55
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 +
## X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X4:X9 1 0.00103 0.27806 -90.434
## - X5:X8 1 0.00630 0.28332 -89.871
## - X3:X9 1 0.01467 0.29170 -88.997
## <none> 0.27702 -88.546
## - X5:X9 1 0.01990 0.29692 -88.465
## - X2:X8 1 0.02658 0.30360 -87.797
## - X1:X2 1 0.02706 0.30408 -87.750
## - X3:X8 1 0.02955 0.30658 -87.504
## - X6:X9 1 0.03164 0.30866 -87.301
## - X6:X8 1 0.03412 0.31114 -87.061
## - X4:X8 1 0.03459 0.31162 -87.015
## - X2:X9 1 0.03623 0.31325 -86.859
## - X8:X9 1 0.03768 0.31470 -86.720
## - X7:X8 1 0.04267 0.31969 -86.248
## - X1:X9 1 0.08036 0.35738 -82.904
## - X7:X9 1 0.08949 0.36652 -82.147
## - X1:X3 1 1.67908 1.95611 -31.907
##
## Step: AIC=-90.43
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X5:X8 +
## X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X5:X8 1 0.01021 0.28826 -91.352
## <none> 0.27806 -90.434
## - X3:X9 1 0.01962 0.29768 -90.388
## - X1:X2 1 0.02775 0.30580 -89.580
## - X3:X8 1 0.02855 0.30661 -89.502
## - X5:X9 1 0.02886 0.30692 -89.471
## - X6:X9 1 0.03330 0.31136 -89.040
## - X8:X9 1 0.03752 0.31557 -88.637
## - X6:X8 1 0.04082 0.31888 -88.324
## - X7:X8 1 0.05026 0.32832 -87.449
## - X2:X8 1 0.07559 0.35365 -85.220
## - X1:X9 1 0.08639 0.36445 -84.317
## - X2:X9 1 0.09477 0.37282 -83.635
## - X7:X9 1 0.09547 0.37353 -83.579
## - X4:X8 1 0.11185 0.38991 -82.291
## - X1:X3 1 1.76157 2.03963 -32.653
##
## Step: AIC=-91.35
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X5:X9 +
## X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X3:X9 1 0.01387 0.30213 -91.943
## <none> 0.28826 -91.352
## - X1:X2 1 0.02606 0.31432 -90.756
## - X6:X9 1 0.02892 0.31718 -90.484
## - X5:X9 1 0.03517 0.32343 -89.899
## - X6:X8 1 0.03526 0.32352 -89.891
## - X8:X9 1 0.03647 0.32474 -89.778
## - X7:X8 1 0.05418 0.34244 -88.186
## - X3:X8 1 0.06678 0.35505 -87.101
## - X1:X9 1 0.08233 0.37059 -85.815
## - X2:X8 1 0.09026 0.37852 -85.180
## - X4:X8 1 0.11594 0.40420 -83.211
## - X2:X9 1 0.12196 0.41023 -82.767
## - X7:X9 1 0.19579 0.48405 -77.803
## - X1:X3 1 1.79585 2.08412 -34.006
##
## Step: AIC=-91.94
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X6:X8 +
## X6:X9 + X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X6:X9 1 0.01568 0.31781 -92.425
## <none> 0.30213 -91.943
## - X6:X8 1 0.02200 0.32413 -91.834
## - X5:X9 1 0.02250 0.32464 -91.788
## - X1:X2 1 0.03453 0.33666 -90.696
## - X7:X8 1 0.04139 0.34352 -90.092
## - X8:X9 1 0.05081 0.35295 -89.279
## - X1:X9 1 0.07027 0.37241 -87.669
## - X2:X8 1 0.07640 0.37853 -87.180
## - X3:X8 1 0.09504 0.39717 -85.737
## - X4:X8 1 0.10557 0.40770 -84.952
## - X2:X9 1 0.10898 0.41111 -84.703
## - X7:X9 1 0.20828 0.51041 -78.212
## - X1:X3 1 1.80109 2.10322 -35.732
##
## Step: AIC=-92.43
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X6:X8 +
## X7:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X6:X8 1 0.00642 0.32423 -93.825
## - X1:X2 1 0.01993 0.33773 -92.601
## <none> 0.31781 -92.425
## - X5:X9 1 0.02449 0.34230 -92.198
## - X7:X8 1 0.02573 0.34354 -92.090
## - X8:X9 1 0.03582 0.35363 -91.221
## - X1:X9 1 0.06065 0.37846 -89.186
## - X2:X8 1 0.06122 0.37903 -89.140
## - X3:X8 1 0.08481 0.40262 -87.329
## - X4:X8 1 0.09252 0.41033 -86.760
## - X2:X9 1 0.09419 0.41200 -86.638
## - X7:X9 1 0.23406 0.55187 -77.869
## - X1:X3 1 1.89418 2.21199 -36.219
##
## Step: AIC=-93.83
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X8 +
## X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X7:X8 1 0.02050 0.34473 -93.986
## - X1:X2 1 0.02195 0.34617 -93.860
## <none> 0.32423 -93.825
## - X6 1 0.02936 0.35359 -93.225
## - X5:X9 1 0.03666 0.36089 -92.611
## - X8:X9 1 0.05990 0.38412 -90.740
## - X2:X8 1 0.10202 0.42624 -87.618
## - X2:X9 1 0.16870 0.49293 -83.258
## - X7:X9 1 0.23395 0.55817 -79.529
## - X1:X9 1 0.25322 0.57745 -78.510
## - X3:X8 1 0.26381 0.58803 -77.965
## - X4:X8 1 0.40165 0.72587 -71.647
## - X1:X3 1 1.90127 2.22550 -38.036
##
## Step: AIC=-93.99
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 +
## X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 +
## X8:X9
##
## Df Sum of Sq RSS AIC
## - X1:X2 1 0.01744 0.36216 -94.506
## <none> 0.34473 -93.986
## - X6 1 0.02380 0.36853 -93.983
## - X8:X9 1 0.04553 0.39026 -92.264
## - X5:X9 1 0.04913 0.39386 -91.988
## - X2:X8 1 0.08418 0.42891 -89.432
## - X2:X9 1 0.15080 0.49553 -85.100
## - X1:X9 1 0.27457 0.61930 -78.411
## - X3:X8 1 0.30355 0.64828 -77.039
## - X7:X9 1 0.32337 0.66809 -76.136
## - X4:X8 1 0.42623 0.77096 -71.840
## - X1:X3 1 1.88785 2.23258 -39.941
##
## Step: AIC=-94.51
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X3 +
## X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X6 1 0.02085 0.38302 -94.826
## <none> 0.36216 -94.506
## - X8:X9 1 0.03703 0.39920 -93.585
## - X5:X9 1 0.03720 0.39936 -93.573
## - X2:X8 1 0.06882 0.43098 -91.286
## - X2:X9 1 0.13369 0.49585 -87.080
## - X1:X9 1 0.26000 0.62217 -80.272
## - X3:X8 1 0.29045 0.65261 -78.839
## - X7:X9 1 0.30630 0.66847 -78.119
## - X4:X8 1 0.40893 0.77109 -73.834
## - X1:X3 1 1.87267 2.23483 -41.911
##
## Step: AIC=-94.83
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 +
## X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X5:X9 1 0.02376 0.40678 -95.021
## - X8:X9 1 0.02579 0.40880 -94.871
## <none> 0.38302 -94.826
## - X2:X8 1 0.04986 0.43287 -93.155
## - X2:X9 1 0.11290 0.49591 -89.077
## - X1:X9 1 0.23967 0.62268 -82.247
## - X3:X8 1 0.28141 0.66443 -80.301
## - X7:X9 1 0.28680 0.66981 -80.059
## - X4:X8 1 0.39670 0.77972 -75.501
## - X1:X3 1 2.28557 2.66859 -38.589
##
## Step: AIC=-95.02
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 +
## X2:X8 + X2:X9 + X3:X8 + X4:X8 + X7:X9 + X8:X9
##
## Df Sum of Sq RSS AIC
## - X8:X9 1 0.02190 0.42867 -95.448
## <none> 0.40678 -95.021
## - X2:X8 1 0.02957 0.43635 -94.916
## - X2:X9 1 0.08976 0.49654 -91.039
## - X5 1 0.18602 0.59280 -85.723
## - X7:X9 1 0.26382 0.67060 -82.024
## - X1:X9 1 0.32901 0.73579 -79.240
## - X3:X8 1 0.35450 0.76128 -78.219
## - X4:X8 1 0.43472 0.84149 -75.213
## - X1:X3 1 2.29575 2.70253 -40.210
##
## Step: AIC=-95.45
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 +
## X2:X8 + X2:X9 + X3:X8 + X4:X8 + X7:X9
##
## Df Sum of Sq RSS AIC
## - X2:X8 1 0.01037 0.43904 -96.731
## <none> 0.42867 -95.448
## - X2:X9 1 0.06896 0.49763 -92.973
## - X5 1 0.17518 0.60385 -87.169
## - X7:X9 1 0.24199 0.67066 -84.021
## - X1:X9 1 0.30722 0.73589 -81.236
## - X3:X8 1 0.34794 0.77661 -79.620
## - X4:X8 1 0.47278 0.90146 -75.148
## - X1:X3 1 2.41556 2.84423 -40.677
##
## Step: AIC=-96.73
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 +
## X2:X9 + X3:X8 + X4:X8 + X7:X9
##
## Df Sum of Sq RSS AIC
## <none> 0.43904 -96.731
## - X5 1 0.17280 0.61185 -88.774
## - X7:X9 1 0.24527 0.68431 -85.416
## - X2:X9 1 0.30033 0.73937 -83.095
## - X1:X9 1 0.31184 0.75088 -82.631
## - X3:X8 1 0.40524 0.84428 -79.114
## - X4:X8 1 0.66804 1.10708 -70.984
## - X1:X3 1 2.49992 2.93897 -41.694
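The AIC column in the trace above can be reproduced by hand. For an `lm` fit, `stepAIC` reports the value returned by `extractAIC`, which is n·log(RSS/n) + 2p, with p the number of estimated coefficients including the intercept. The following is a minimal sketch, assuming n = 30 (the N reported in the comparison table later in this report) and coefficient counts read off the printed formulas; `step_aic` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: AIC as printed in a stepAIC/step trace for an lm fit.
# rss = residual sum of squares, n = number of observations,
# p   = number of estimated coefficients (intercept included).
step_aic <- function(rss, n, p) n * log(rss / n) + 2 * p

n <- 30  # assumption: N as reported in the comparison table

# Final model of the trace: intercept + 8 main effects + 6 interactions = 15
aic_final <- step_aic(rss = 0.43904, n = n, p = 15)

# First model with a drop table: intercept + 9 main effects + 17 interactions = 27
aic_start <- step_aic(rss = 0.27577, n = n, p = 27)

print(round(c(final = aic_final, start = aic_start), 2))
```

Both values agree with the trace (-96.731 and -86.682) up to the rounding of the printed RSS. Note that the starting model estimates 27 coefficients from 30 observations, so only 3 residual degrees of freedom remain.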
# Interaction regressions, each leaving out one of X1..X9
model_wf_rm1_log_inter <- lm(log(y) ~ (X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm1_aic_log_inter <- stepAIC(model_wf_rm1_log_inter)
model_wf_rm2_log_inter <- lm(log(y) ~ (X1 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm2_aic_log_inter <- stepAIC(model_wf_rm2_log_inter)
model_wf_rm3_log_inter <- lm(log(y) ~ (X1 + X2 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm3_aic_log_inter <- stepAIC(model_wf_rm3_log_inter)
model_wf_rm5_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm5_aic_log_inter <- stepAIC(model_wf_rm5_log_inter)
model_wf_rm4_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm4_aic_log_inter <- stepAIC(model_wf_rm4_log_inter)
model_wf_rm6_log_inter <- lm(log(y) ~ (X2 + X3 + X1 + X5 + X4 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm6_aic_log_inter <- stepAIC(model_wf_rm6_log_inter)
model_wf_rm7_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X8 + X9)^2, data=table_wf)
model_wf_rm7_aic_log_inter <- stepAIC(model_wf_rm7_log_inter)
model_wf_rm8_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X9)^2, data=table_wf)
model_wf_rm8_aic_log_inter <- stepAIC(model_wf_rm8_log_inter)
model_wf_rm9_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8)^2, data=table_wf)
model_wf_rm9_aic_log_inter <- stepAIC(model_wf_rm9_log_inter)
# Interaction regression for 136789
model_wf_136789_log_inter <- lm(log(y) ~ (X3 + X1 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_136789_aic_log_inter <- stepAIC(model_wf_136789_log_inter)
# Interaction regression for 436789
model_wf_436789_log_inter <- lm(log(y) ~ (X3 + X4 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_436789_aic_log_inter <- stepAIC(model_wf_436789_log_inter)
# Interaction regression for 437
model_wf_437_log_inter <- lm(log(y) ~ (X3 + X4 + X7)^2, data=table_wf)
model_wf_437_aic_log_inter <- stepAIC(model_wf_437_log_inter)
# Interaction regression for 489
model_wf_489_log_inter <- lm(log(y) ~ (X4 + X8 + X9)^2, data=table_wf)
model_wf_489_aic_log_inter <- stepAIC(model_wf_489_log_inter)
# Interaction regression by groups
model_wf_3g_log_inter <- lm(log(y) ~ (log(X4) + log(X6))^2 + log(X8) + log(X9), data=table_wf)
model_wf_3g_aic_log_inter <- stepAIC(model_wf_3g_log_inter)
# Interaction regression by groups1
model_wf_3g1_log_inter <- lm(log(y) ~ (log(X3) + log(X7))^2 + log(X8) + log(X9), data=table_wf)
model_wf_3g1_aic_log_inter <- stepAIC(model_wf_3g1_log_inter)
# Comparison
huxreg(model_wf_rm1_aic_log_inter, model_wf_rm2_aic_log_inter, model_wf_rm3_aic_log_inter, model_wf_rm4_aic_log_inter, model_wf_rm5_aic_log_inter, model_wf_rm6_aic_log_inter, model_wf_rm7_aic_log_inter, model_wf_rm8_aic_log_inter, model_wf_rm9_aic_log_inter, model_wf_aic_log_inter)
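# Compare the stepAIC-selected fits throughout (the 437 and 489 entries previously passed the unselected full models)
huxreg(model_wf_136789_aic_log_inter, model_wf_436789_aic_log_inter, model_wf_437_aic_log_inter, model_wf_489_aic_log_inter, model_wf_3g_aic_log_inter, model_wf_3g1_aic_log_inter)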
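The nine leave-one-out fits above differ only in which predictor is dropped, so they can also be generated in a loop. The sketch below illustrates the pattern on the built-in `mtcars` data rather than `table_wf`, with four stand-in predictors instead of X1..X9, and uses `stats::step` as a stand-in for `MASS::stepAIC`. The names `predictors`, `fit_without` and `models_rm` are hypothetical.

```r
# Illustration on built-in mtcars data; the predictor names are stand-ins
# for X1..X9 in table_wf.
predictors <- c("wt", "hp", "disp", "drat")

# One model per dropped predictor: all remaining main effects plus their
# two-way interactions, followed by backward AIC selection.
fit_without <- function(dropped) {
  kept <- setdiff(predictors, dropped)
  f <- as.formula(paste0("log(mpg) ~ (", paste(kept, collapse = " + "), ")^2"))
  full <- lm(f, data = mtcars)
  # stats::step used here as a stand-in for MASS::stepAIC
  step(full, direction = "backward", trace = 0)
}

# Named list: rm_wt is the selected model fitted without wt, and so on.
models_rm <- lapply(setNames(predictors, paste0("rm_", predictors)), fit_without)

# Number of coefficients retained in each selected model
print(sapply(models_rm, function(m) length(coef(m))))
```

Keeping the fits in a named list makes it harder to mislabel a model (for example, fitting the "remove X4" formula under the `rm5` name) and lets the whole list be passed on to a comparison table.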
# Build the all-log model (every predictor log-transformed)
model_wf_all_log <- lm(log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + log(X7) + log(X8) + log(X9), data=table_wf)
ols_vif_tol(model_wf_all_log)
| Variables | Tolerance | VIF |
|---|---|---|
| log(X1) | 0.00608 | 164 |
| log(X2) | 0.0865 | 11.6 |
| log(X3) | 0.0816 | 12.3 |
| log(X4) | 0.00767 | 130 |
| log(X5) | 0.0885 | 11.3 |
| log(X6) | 0.421 | 2.37 |
| log(X7) | 0.108 | 9.29 |
| log(X8) | 0.193 | 5.18 |
| log(X9) | 0.187 | 5.35 |
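The two columns of this table carry the same information: tolerance is 1 − R² from regressing one predictor on all the others, and VIF is its reciprocal (for log(X1), 1/0.00608 ≈ 164). The sketch below recomputes both quantities by hand on the built-in `mtcars` data, since `table_wf` is not available here; the chosen columns are an arbitrary stand-in for log(X1)..log(X9).

```r
# Illustration on the built-in mtcars data (not the document's table_wf).
X <- log(mtcars[, c("disp", "hp", "wt", "qsec")])

# Tolerance of predictor j = 1 - R^2 from regressing it on the other predictors.
tolerance <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  fit <- lm(reformulate(others, response = v), data = X)
  1 - summary(fit)$r.squared
})

# VIF is the reciprocal of the tolerance.
vif <- 1 / tolerance

print(round(cbind(Tolerance = tolerance, VIF = vif), 3))
```

A tolerance near zero means the predictor is almost a linear combination of the others, which is the situation for log(X1) and log(X4) in the table above.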
model_wf_aic_all_log <- stepAIC(model_wf_all_log)
## Start: AIC=-84.46
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X7) 1 0.0027 0.9251 -86.371
## - log(X2) 1 0.0310 0.9534 -85.468
## - log(X4) 1 0.0370 0.9595 -85.277
## - log(X3) 1 0.0525 0.9750 -84.797
## - log(X5) 1 0.0574 0.9798 -84.647
## <none> 0.9224 -84.459
## - log(X6) 1 0.2329 1.1553 -79.705
## - log(X1) 1 0.2712 1.1936 -78.727
## - log(X9) 1 3.4818 4.4043 -39.559
## - log(X8) 1 3.6487 4.5711 -38.443
##
## Step: AIC=-86.37
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X8) + log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X4) 1 0.0370 0.9621 -87.193
## - log(X2) 1 0.0559 0.9810 -86.611
## <none> 0.9251 -86.371
## - log(X3) 1 0.0777 1.0028 -85.953
## - log(X5) 1 0.0983 1.0234 -85.341
## - log(X6) 1 0.3174 1.2425 -79.522
## - log(X1) 1 0.3899 1.3150 -77.820
## - log(X9) 1 3.4793 4.4044 -41.558
## - log(X8) 1 3.6745 4.5996 -40.257
##
## Step: AIC=-87.19
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X5) + log(X6) + log(X8) +
## log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X2) 1 0.0369 0.9990 -88.065
## <none> 0.9621 -87.193
## - log(X5) 1 0.1419 1.1040 -85.067
## - log(X6) 1 0.2985 1.2607 -81.087
## - log(X3) 1 0.5087 1.4709 -76.460
## - log(X9) 1 3.4515 4.4137 -43.495
## - log(X8) 1 3.6420 4.6042 -42.227
## - log(X1) 1 3.8134 4.7755 -41.131
##
## Step: AIC=-88.07
## log(y) ~ log(X1) + log(X3) + log(X5) + log(X6) + log(X8) + log(X9)
##
## Df Sum of Sq RSS AIC
## <none> 0.9990 -88.065
## - log(X5) 1 0.1087 1.1077 -86.967
## - log(X6) 1 0.3805 1.3795 -80.384
## - log(X3) 1 0.8252 1.8242 -72.001
## - log(X9) 1 3.4549 4.4539 -45.222
## - log(X8) 1 3.7305 4.7295 -43.421
## - log(X1) 1 17.5601 18.5592 -2.407
ols_vif_tol(model_wf_aic_all_log)
| Variables | Tolerance | VIF |
|---|---|---|
| log(X1) | 0.263 | 3.8 |
| log(X3) | 0.603 | 1.66 |
| log(X5) | 0.22 | 4.55 |
| log(X6) | 0.71 | 1.41 |
| log(X8) | 0.201 | 4.99 |
| log(X9) | 0.191 | 5.22 |
# Interaction regression for all log model
model_wf_all_log_inter <- lm(log(y) ~ (log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + log(X7) + log(X8) + log(X9))^2, data=table_wf)
model_wf_aic_all_log_inter <- stepAIC(model_wf_all_log_inter)
## Start: AIC=-117.75
## log(y) ~ (log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9))^2
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) +
## log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X6) + log(X5):log(X7) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) +
## log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) +
## log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X6) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) +
## log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) +
## log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) +
## log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) +
## log(X4):log(X6) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) +
## log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) +
## log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) +
## log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) +
## log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) +
## log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X8) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X8) + log(X3):log(X9) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) +
## log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) +
## log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) +
## log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) +
## log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X5) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) +
## log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) +
## log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) +
## log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) +
## log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) +
## log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X8) +
## log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X8) + log(X1):log(X9) +
## log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) +
## log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X4) + log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) +
## log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) +
## log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) +
## log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
##
## Step: AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) +
## log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) +
## log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X3):log(X8) 1 0.0000903 0.097990 -119.72
## - log(X3):log(X9) 1 0.0003856 0.098286 -119.63
## - log(X7):log(X8) 1 0.0008771 0.098777 -119.48
## - log(X2):log(X9) 1 0.0009602 0.098860 -119.46
## - log(X2):log(X8) 1 0.0012711 0.099171 -119.36
## - log(X4):log(X9) 1 0.0013307 0.099231 -119.34
## - log(X6):log(X8) 1 0.0013914 0.099291 -119.33
## - log(X7):log(X9) 1 0.0014236 0.099324 -119.32
## - log(X5):log(X8) 1 0.0014490 0.099349 -119.31
## - log(X5):log(X9) 1 0.0015560 0.099456 -119.28
## - log(X4):log(X8) 1 0.0016138 0.099514 -119.26
## - log(X6):log(X9) 1 0.0016209 0.099521 -119.26
## - log(X1):log(X9) 1 0.0025113 0.100411 -118.99
## - log(X1):log(X2) 1 0.0025483 0.100448 -118.98
## - log(X8):log(X9) 1 0.0025636 0.100464 -118.97
## - log(X1):log(X8) 1 0.0029919 0.100892 -118.85
## <none> 0.097900 -117.75
## - log(X1):log(X3) 1 0.0093915 0.107292 -117.00
##
## Step: AIC=-119.72
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9) + log(X8):log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X8):log(X9) 1 0.004379 0.10237 -120.41
## - log(X1):log(X2) 1 0.004551 0.10254 -120.36
## - log(X3):log(X9) 1 0.004640 0.10263 -120.33
## <none> 0.09799 -119.72
## - log(X7):log(X8) 1 0.006852 0.10484 -119.69
## - log(X7):log(X9) 1 0.011626 0.10962 -118.36
## - log(X6):log(X8) 1 0.012314 0.11030 -118.17
## - log(X6):log(X9) 1 0.016183 0.11417 -117.14
## - log(X1):log(X9) 1 0.022498 0.12049 -115.52
## - log(X4):log(X9) 1 0.024248 0.12224 -115.09
## - log(X1):log(X8) 1 0.025269 0.12326 -114.84
## - log(X2):log(X9) 1 0.025677 0.12367 -114.74
## - log(X1):log(X3) 1 0.027330 0.12532 -114.34
## - log(X5):log(X9) 1 0.029018 0.12701 -113.94
## - log(X4):log(X8) 1 0.030440 0.12843 -113.61
## - log(X5):log(X8) 1 0.031030 0.12902 -113.47
## - log(X2):log(X8) 1 0.034946 0.13294 -112.57
##
## Step: AIC=-120.41
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) +
## log(X7):log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X1):log(X2) 1 0.0019105 0.10428 -121.86
## - log(X7):log(X8) 1 0.0042768 0.10665 -121.18
## <none> 0.10237 -120.41
## - log(X7):log(X9) 1 0.0085205 0.11089 -120.01
## - log(X6):log(X8) 1 0.0088118 0.11118 -119.93
## - log(X6):log(X9) 1 0.0122126 0.11458 -119.03
## - log(X1):log(X9) 1 0.0181216 0.12049 -117.52
## - log(X4):log(X9) 1 0.0203055 0.12268 -116.98
## - log(X3):log(X9) 1 0.0205839 0.12295 -116.91
## - log(X1):log(X8) 1 0.0209330 0.12330 -116.83
## - log(X2):log(X9) 1 0.0213148 0.12368 -116.74
## - log(X1):log(X3) 1 0.0237155 0.12609 -116.16
## - log(X5):log(X9) 1 0.0249321 0.12730 -115.87
## - log(X4):log(X8) 1 0.0267336 0.12910 -115.45
## - log(X5):log(X8) 1 0.0270663 0.12944 -115.37
## - log(X2):log(X8) 1 0.0305761 0.13295 -114.57
##
## Step: AIC=-121.86
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X8) +
## log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X9) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9)
##
## Df Sum of Sq RSS AIC
## - log(X7):log(X8) 1 0.0055209 0.10980 -122.31
## <none> 0.10428 -121.86
## - log(X7):log(X9) 1 0.0100914 0.11437 -121.08
## - log(X6):log(X8) 1 0.0111603 0.11544 -120.81
## - log(X6):log(X9) 1 0.0144674 0.11875 -119.96
## - log(X2):log(X9) 1 0.0220327 0.12631 -118.11
## - log(X1):log(X9) 1 0.0224953 0.12677 -118.00
## - log(X3):log(X9) 1 0.0225020 0.12678 -118.00
## - log(X4):log(X9) 1 0.0227039 0.12698 -117.95
## - log(X1):log(X3) 1 0.0232574 0.12754 -117.82
## - log(X5):log(X9) 1 0.0244629 0.12874 -117.53
## - log(X5):log(X8) 1 0.0257774 0.13006 -117.23
## - log(X1):log(X8) 1 0.0262811 0.13056 -117.11
## - log(X4):log(X8) 1 0.0296051 0.13389 -116.36
## - log(X2):log(X8) 1 0.0312323 0.13551 -116.00
##
## Step: AIC=-122.31
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X8) +
## log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X9) +
## log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) +
## log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X9)
##
## Df Sum of Sq RSS AIC
## <none> 0.10980 -122.309
## - log(X3):log(X9) 1 0.020900 0.13070 -119.081
## - log(X1):log(X3) 1 0.021290 0.13109 -118.992
## - log(X7):log(X9) 1 0.023868 0.13367 -118.408
## - log(X6):log(X8) 1 0.030444 0.14024 -116.967
## - log(X5):log(X8) 1 0.032198 0.14200 -116.594
## - log(X1):log(X9) 1 0.034511 0.14431 -116.109
## - log(X5):log(X9) 1 0.036577 0.14638 -115.683
## - log(X6):log(X9) 1 0.057094 0.16689 -111.748
## - log(X1):log(X8) 1 0.058955 0.16876 -111.415
## - log(X2):log(X9) 1 0.067720 0.17752 -109.896
## - log(X4):log(X9) 1 0.086014 0.19581 -106.953
## - log(X2):log(X8) 1 0.117909 0.22771 -102.426
## - log(X4):log(X8) 1 0.199440 0.30924 -93.245
huxreg(model_wf_aic_log, model_wf_aic_all_log, model_wf_aic_log_inter, model_wf_aic_all_log_inter)
| | (1) | (2) | (3) | (4) |
| (Intercept) | 2.692 *** | 0.571 | -0.981 | -16.027 |
| | (0.445) | (3.360) | (1.520) | (13.947) |
| X3 | 0.184 *** | | 0.493 *** | |
| | (0.032) | | (0.066) | |
| X4 | 0.109 *** | | 0.168 ** | |
| | (0.026) | | (0.054) | |
| X6 | -0.368 * | | | |
| | (0.146) | | | |
| X7 | 4.085 ** | | 2.515 | |
| | (1.213) | | (1.258) | |
| X8 | 0.612 *** | | 0.586 *** | |
| | (0.133) | | (0.087) | |
| X9 | -0.448 *** | | -0.248 * | |
| | (0.108) | | (0.105) | |
| log(X1) | | 0.726 *** | | 1.158 |
| | | (0.036) | | (0.697) |
| log(X3) | | 0.419 *** | | 1.666 |
| | | (0.096) | | (1.763) |
| log(X5) | | 1.259 | | 5.028 |
| | | (0.796) | | (3.473) |
| log(X6) | | -0.267 ** | | -0.158 |
| | | (0.090) | | (0.300) |
| log(X8) | | 1.623 *** | | 44.919 |
| | | (0.175) | | (24.175) |
| log(X9) | | -1.375 *** | | -37.066 |
| | | (0.154) | | (19.170) |
| X1 | | | 2.712 *** | |
| | | | (0.330) | |
| X2 | | | -24.452 *** | |
| | | | (5.631) | |
| X5 | | | 0.039 * | |
| | | | (0.016) | |
| X1:X3 | | | -0.416 *** | |
| | | | (0.045) | |
| X1:X9 | | | -0.118 ** | |
| | | | (0.036) | |
| X2:X9 | | | 4.643 ** | |
| | | | (1.449) | |
| X3:X8 | | | -0.051 ** | |
| | | | (0.014) | |
| X4:X8 | | | 0.071 *** | |
| | | | (0.015) | |
| X7:X9 | | | -0.995 * | |
| | | | (0.344) | |
| log(X2) | | | | -0.733 |
| | | | | (0.338) |
| log(X4) | | | | -2.394 |
| | | | | (2.963) |
| log(X7) | | | | -0.408 |
| | | | | (0.518) |
| log(X1):log(X3) | | | | 0.823 |
| | | | | (0.707) |
| log(X1):log(X8) | | | | 1.242 |
| | | | | (0.641) |
| log(X1):log(X9) | | | | -1.000 |
| | | | | (0.674) |
| log(X2):log(X8) | | | | 1.089 * |
| | | | | (0.397) |
| log(X2):log(X9) | | | | -0.710 |
| | | | | (0.341) |
| log(X3):log(X9) | | | | 0.411 |
| | | | | (0.356) |
| log(X4):log(X8) | | | | -3.029 ** |
| | | | | (0.849) |
| log(X4):log(X9) | | | | 2.174 |
| | | | | (0.929) |
| log(X5):log(X8) | | | | -7.969 |
| | | | | (5.562) |
| log(X5):log(X9) | | | | 6.795 |
| | | | | (4.450) |
| log(X6):log(X8) | | | | 0.403 |
| | | | | (0.289) |
| log(X6):log(X9) | | | | -0.506 |
| | | | | (0.265) |
| log(X7):log(X9) | | | | 0.464 |
| | | | | (0.376) |
| N | 30 | 30 | 30 | 30 |
| R2 | 0.947 | 0.986 | 0.994 | 0.998 |
| logLik | -11.581 | 8.464 | 20.797 | 41.586 |
| AIC | 39.161 | -0.929 | -9.594 | -35.172 |
*** p < 0.001; ** p < 0.01; * p < 0.05.
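The AIC row of the `huxreg` table is on a different scale from the `stepAIC` trace, because it keeps the additive constant of the Gaussian log-likelihood. A minimal sketch, assuming the usual `AIC()` definition `-2 * logLik + 2 * (p + 1)` with one extra parameter for the error variance; the coefficient counts are counted by hand from the table and are not printed anywhere in the output.

```r
# logLik values read from the huxreg table above; coefficient counts
# (including the intercept) are counted by hand from the same table.
loglik <- c(m1 = -11.581, m2 = 8.464, m3 = 20.797, m4 = 41.586)
n_coef <- c(m1 = 7, m2 = 7, m3 = 15, m4 = 23)

# AIC() form: -2 * logLik + 2 * (coefficients + 1 for the error variance)
aic_full <- -2 * loglik + 2 * (n_coef + 1)
print(round(aic_full, 3))

# Link to the stepAIC scale for model (4): add back the dropped constant
# n * (1 + log(2 * pi)) and the 2 for the variance parameter.
n <- 30
aic_step_m4 <- -122.309
aic_from_step <- aic_step_m4 + n * (1 + log(2 * pi)) + 2
print(round(aic_from_step, 2))
```

Both routes should match the tabulated AIC values to within rounding, which is why the ranking of models is the same on either scale.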
# Mixed regression 1
model_wf_mix1 <- lm(log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) + log(X8) + log(X9), data = table_wf)
model_wf_aic_mix1 <- stepAIC(model_wf_mix1)
## Start: AIC=-80.71
## log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) + log(X8) +
## log(X9)
##
##
## Step: AIC=-80.71
## log(y) ~ X1 + X3 + X4 + log(X2 + X5 + X6 + X7) + log(X8) + log(X9) +
## X1:X3 + X1:X4
##
## Df Sum of Sq RSS AIC
## - log(X2 + X5 + X6 + X7) 1 0.0122 1.1294 -82.385
## <none> 1.1172 -80.711
## - X1:X4 1 0.2938 1.4110 -75.706
## - X1:X3 1 1.8665 2.9837 -53.241
## - log(X9) 1 3.2544 4.3716 -41.782
## - log(X8) 1 3.4346 4.5518 -40.570
##
## Step: AIC=-82.38
## log(y) ~ X1 + X3 + X4 + log(X8) + log(X9) + X1:X3 + X1:X4
##
## Df Sum of Sq RSS AIC
## <none> 1.1294 -82.385
## - X1:X4 1 0.3135 1.4429 -77.036
## - X1:X3 1 2.3929 3.5224 -50.262
## - log(X9) 1 3.9030 5.0324 -39.559
## - log(X8) 1 4.3068 5.4362 -37.243
# Mixed regression 2
model_wf_mix2 <- lm(log(y) ~ (log(X1) + log(X3) + log(X4))^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 + log(X8) + log(X9), data = table_wf)
model_wf_aic_mix2 <- stepAIC(model_wf_mix2)
## Start: AIC=-81.81
## log(y) ~ (log(X1) + log(X3) + log(X4))^2 + (log(X2) + log(X5) +
## log(X6) + log(X7))^2 + log(X8) + log(X9)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) +
## log(X5):log(X6) + log(X5):log(X7)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) +
## log(X5):log(X6)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4) + log(X2):log(X5)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) +
## log(X3):log(X4)
##
##
## Step: AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) +
## log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4)
##
## Df Sum of Sq RSS AIC
## - log(X6) 1 0.0002 0.8820 -83.802
## - log(X1):log(X4) 1 0.0015 0.8833 -83.759
## - log(X5) 1 0.0082 0.8900 -83.531
## - log(X2) 1 0.0112 0.8930 -83.431
## - log(X1):log(X3) 1 0.0183 0.9001 -83.195
## - log(X7) 1 0.0187 0.9005 -83.181
## <none> 0.8818 -81.810
## - log(X9) 1 3.4852 4.3670 -35.813
## - log(X8) 1 3.6405 4.5223 -34.765
##
## Step: AIC=-83.8
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X7) +
## log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4)
##
## Df Sum of Sq RSS AIC
## - log(X1):log(X4) 1 0.0043 0.8863 -85.658
## - log(X5) 1 0.0305 0.9125 -84.783
## - log(X2) 1 0.0553 0.9374 -83.977
## <none> 0.8820 -83.802
## - log(X7) 1 0.1162 0.9982 -82.091
## - log(X1):log(X3) 1 0.1300 1.0120 -81.678
## - log(X9) 1 3.4883 4.3704 -37.791
## - log(X8) 1 3.6480 4.5300 -36.714
##
## Step: AIC=-85.66
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X7) +
## log(X8) + log(X9) + log(X1):log(X3)
##
## Df Sum of Sq RSS AIC
## - log(X5) 1 0.0276 0.9138 -86.739
## - log(X2) 1 0.0541 0.9404 -85.880
## <none> 0.8863 -85.658
## - log(X7) 1 0.1577 1.0440 -82.744
## - log(X1):log(X3) 1 0.2690 1.1553 -79.705
## - log(X4) 1 0.2899 1.1762 -79.167
## - log(X9) 1 3.5066 4.3929 -39.636
## - log(X8) 1 3.6607 4.5470 -38.602
##
## Step: AIC=-86.74
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X7) + log(X8) +
## log(X9) + log(X1):log(X3)
##
## Df Sum of Sq RSS AIC
## - log(X2) 1 0.0365 0.9503 -87.565
## <none> 0.9138 -86.739
## - log(X7) 1 0.1453 1.0591 -84.312
## - log(X1):log(X3) 1 0.8295 1.7433 -69.362
## - log(X4) 1 0.9236 1.8374 -67.785
## - log(X9) 1 3.7486 4.6625 -39.850
## - log(X8) 1 4.2005 5.1143 -37.075
##
## Step: AIC=-87.56
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X7) + log(X8) + log(X9) +
## log(X1):log(X3)
##
## Df Sum of Sq RSS AIC
## <none> 0.9503 -87.565
## - log(X7) 1 0.1793 1.1296 -84.379
## - log(X1):log(X3) 1 0.7962 1.7466 -71.307
## - log(X4) 1 0.9195 1.8698 -69.261
## - log(X9) 1 4.1464 5.0968 -39.178
## - log(X8) 1 4.5091 5.4594 -37.116
# Mixed regression 3
model_wf_mix3 <- lm(log(y) ~ (X1 + X3 + X4)^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 + log(X8) + log(X9), data = table_wf)
model_wf_aic_mix3 <- stepAIC(model_wf_mix3)
## Start: AIC=-81.81
## log(y) ~ (X1 + X3 + X4)^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 +
## log(X8) + log(X9)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) +
## log(X2):log(X6) + log(X2):log(X7) + log(X5):log(X6) + log(X5):log(X7)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) +
## log(X2):log(X6) + log(X2):log(X7) + log(X5):log(X6)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) +
## log(X2):log(X6) + log(X2):log(X7)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) +
## log(X2):log(X6)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5)
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4
##
##
## Step: AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) +
## log(X8) + log(X9) + X1:X3 + X1:X4
##
## Df Sum of Sq RSS AIC
## - log(X6) 1 0.0002 0.8820 -83.802
## - log(X5) 1 0.0082 0.8900 -83.531
## - log(X2) 1 0.0112 0.8930 -83.431
## - log(X7) 1 0.0187 0.9005 -83.181
## - X1:X4 1 0.0379 0.9197 -82.547
## <none> 0.8818 -81.810
## - X1:X3 1 0.3016 1.1834 -74.984
## - log(X9) 1 3.4852 4.3670 -35.813
## - log(X8) 1 3.6405 4.5223 -34.765
##
## Step: AIC=-83.8
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X7) + log(X8) +
## log(X9) + X1:X3 + X1:X4
##
## Df Sum of Sq RSS AIC
## - log(X5) 1 0.0305 0.9125 -84.783
## - log(X2) 1 0.0553 0.9374 -83.977
## <none> 0.8820 -83.802
## - log(X7) 1 0.1162 0.9982 -82.091
## - X1:X4 1 0.1973 1.0793 -79.746
## - X1:X3 1 1.9085 2.7905 -51.249
## - log(X9) 1 3.4883 4.3704 -37.791
## - log(X8) 1 3.6480 4.5300 -36.714
##
## Step: AIC=-84.78
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X7) + log(X8) + log(X9) +
## X1:X3 + X1:X4
##
## Df Sum of Sq RSS AIC
## - log(X2) 1 0.0277 0.9402 -85.886
## <none> 0.9125 -84.783
## - log(X7) 1 0.1121 1.0246 -83.308
## - X1:X4 1 0.1731 1.0856 -81.571
## - X1:X3 1 1.8806 2.7930 -53.222
## - log(X9) 1 3.6887 4.6012 -38.246
## - log(X8) 1 4.1243 5.0368 -35.533
##
## Step: AIC=-85.89
## log(y) ~ X1 + X3 + X4 + log(X7) + log(X8) + log(X9) + X1:X3 +
## X1:X4
##
## Df Sum of Sq RSS AIC
## <none> 0.9402 -85.886
## - log(X7) 1 0.1892 1.1294 -82.385
## - X1:X4 1 0.2211 1.1613 -81.549
## - X1:X3 1 2.5494 3.4896 -48.542
## - log(X9) 1 4.0912 5.0314 -37.565
## - log(X8) 1 4.4859 5.4261 -35.300
huxreg(model_wf_aic_mix1, model_wf_aic_mix2, model_wf_aic_mix3)
| | (1) | (2) | (3) |
| (Intercept) | 2.598 *** | 2.908 *** | 2.044 *** |
| | (0.228) | (0.702) | (0.343) |
| X1 | 1.290 *** | | 1.459 *** |
| | (0.232) | | (0.232) |
| X3 | 0.296 *** | | 0.275 *** |
| | (0.040) | | (0.039) |
| X4 | 0.391 *** | | 0.423 *** |
| | (0.042) | | (0.042) |
| log(X8) | 1.575 *** | 1.631 *** | 1.628 *** |
| | (0.172) | (0.160) | (0.163) |
| log(X9) | -1.345 *** | -1.411 *** | -1.405 *** |
| | (0.154) | (0.144) | (0.147) |
| X1:X3 | -0.381 *** | | -0.398 *** |
| | (0.056) | | (0.053) |
| X1:X4 | 0.031 * | | 0.026 * |
| | (0.012) | | (0.012) |
| log(X1) | | 0.143 | |
| | | (0.184) | |
| log(X3) | | -1.810 ** | |
| | | (0.490) | |
| log(X4) | | 3.546 *** | |
| | | (0.769) | |
| log(X7) | | -0.408 | -0.428 |
| | | (0.200) | (0.208) |
| log(X1):log(X3) | | -0.756 *** | |
| | | (0.176) | |
| N | 30 | 30 | 30 |
| R2 | 0.984 | 0.987 | 0.987 |
| logLik | 6.624 | 9.214 | 9.375 |
| AIC | 4.752 | -0.428 | 1.250 |
*** p < 0.001; ** p < 0.01; * p < 0.05.
# Stepwise Regression based on p values for full model #
k <- ols_step_both_p(model_wf_full_log)
## Stepwise Selection Method
## ---------------------------
##
## Candidate Terms:
##
## 1. X1
## 2. X2
## 3. X3
## 4. X4
## 5. X5
## 6. X6
## 7. X7
## 8. X8
## 9. X9
##
## We are selecting variables based on p value...
##
## Variables Entered/Removed:
##
## - X4 added
## - X3 added
## - X7 added
##
## No more variables to be added/removed.
##
##
## Final Model Output
## ------------------
##
## Model Summary
## -------------------------------------------------------------
## R 0.944 RMSE 0.549
## R-Squared 0.890 Coef. Var 8.618
## Adj. R-Squared 0.878 MSE 0.301
## Pred R-Squared 0.854 MAE 0.414
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 63.565 3 21.188 70.378 0.0000
## Residual 7.828 26 0.301
## Total 71.393 29
## -------------------------------------------------------------------
##
## Parameter Estimates
## -------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -------------------------------------------------------------------------------------
## (Intercept) 2.872 0.547 5.254 0.000 1.748 3.995
## X4 0.122 0.033 0.559 3.730 0.001 0.055 0.189
## X3 0.168 0.040 0.435 4.165 0.000 0.085 0.251
## X7 3.106 1.537 0.309 2.021 0.054 -0.053 6.266
## -------------------------------------------------------------------------------------
k
##
## Stepwise Selection Summary
## ------------------------------------------------------------------------------------
## Added/ Adj.
## Step Variable Removed R-Square R-Square C(p) AIC RMSE
## ------------------------------------------------------------------------------------
## 1 X4 addition 0.803 0.796 48.8550 68.4060 0.7087
## 2 X3 addition 0.873 0.864 24.2130 57.2082 0.5792
## 3 X7 addition 0.890 0.878 19.6670 54.8305 0.5487
## ------------------------------------------------------------------------------------
# plot(k)
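The summary statistics in the final-model output follow directly from the ANOVA sums of squares. The sketch below recomputes them from the printed values only; the variable names are illustrative and no model object is needed.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table
# of the p-value stepwise fit (X4 + X3 + X7).
ss_reg <- 63.565; df_reg <- 3
ss_res <- 7.828;  df_res <- 26

ms_reg <- ss_reg / df_reg            # mean square for regression
ms_res <- ss_res / df_res            # mean square error (MSE)
f_stat <- ms_reg / ms_res            # overall F statistic
p_value <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

r2     <- ss_reg / (ss_reg + ss_res)                   # R-squared
adj_r2 <- 1 - (1 - r2) * (df_reg + df_res) / df_res    # adjusted R-squared

print(round(c(F = f_stat, R2 = r2, adjR2 = adj_r2, RMSE = sqrt(ms_res)), 3))
```

Small differences from the printed F value are expected because the sums of squares are rounded to three decimals.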
# Stepwise AIC Regression for full model#
k <- ols_step_both_aic(model_wf_full_log)
## Stepwise Selection Method
## -------------------------
##
## Candidate Terms:
##
## 1 . X1
## 2 . X2
## 3 . X3
## 4 . X4
## 5 . X5
## 6 . X6
## 7 . X7
## 8 . X8
## 9 . X9
##
##
## Variables Entered/Removed:
##
## - X4 added
## - X3 added
## - X7 added
## - X8 added
## - X9 added
## - X6 added
##
## No more variables to be added or removed.
k
##
##
## Stepwise Summary
## --------------------------------------------------------------------------
## Variable Method AIC RSS Sum Sq R-Sq Adj. R-Sq
## --------------------------------------------------------------------------
## X4 addition 68.406 14.063 57.330 0.80302 0.79599
## X3 addition 57.208 9.057 62.335 0.87313 0.86373
## X7 addition 54.830 7.828 63.565 0.89036 0.87771
## X8 addition 54.522 7.248 64.144 0.89848 0.88223
## X9 addition 44.504 4.856 66.537 0.93199 0.91782
## X6 addition 39.161 3.801 67.591 0.94675 0.93286
## --------------------------------------------------------------------------
# plot(k)
# Stepwise Regression based on p values for all log model #
k <- ols_step_both_p(model_wf_all_log)
## Stepwise Selection Method
## ---------------------------
##
## Candidate Terms:
##
## 1. log(X1)
## 2. log(X2)
## 3. log(X3)
## 4. log(X4)
## 5. log(X5)
## 6. log(X6)
## 7. log(X7)
## 8. log(X8)
## 9. log(X9)
##
## We are selecting variables based on p value...
##
## Variables Entered/Removed:
##
## - log(X4) added
##
## No more variables to be added/removed.
##
##
## Final Model Output
## ------------------
##
## Model Summary
## -------------------------------------------------------------
## R 0.954 RMSE 0.479
## R-Squared 0.910 Coef. Var 7.526
## Adj. R-Squared 0.907 MSE 0.230
## Pred R-Squared 0.896 MAE 0.353
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## --------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## --------------------------------------------------------------------
## Regression 64.964 1 64.964 282.937 0.0000
## Residual 6.429 28 0.230
## Total 71.393 29
## --------------------------------------------------------------------
##
## Parameter Estimates
## -------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -------------------------------------------------------------------------------------
## (Intercept) 4.189 0.156 26.803 0.000 3.868 4.509
## log(X4) 1.259 0.075 0.954 16.821 0.000 1.106 1.413
## -------------------------------------------------------------------------------------
k
##
## Stepwise Selection Summary
## -------------------------------------------------------------------------------------
## Added/ Adj.
## Step Variable Removed R-Square R-Square C(p) AIC RMSE
## -------------------------------------------------------------------------------------
## 1 log(X4) addition 0.910 0.907 113.3920 44.9246 0.4792
## -------------------------------------------------------------------------------------
# plot(k)
# Stepwise AIC Regression for all log model #
k <- ols_step_both_aic(model_wf_all_log)
## Stepwise Selection Method
## -------------------------
##
## Candidate Terms:
##
## 1 . log(X1)
## 2 . log(X2)
## 3 . log(X3)
## 4 . log(X4)
## 5 . log(X5)
## 6 . log(X6)
## 7 . log(X7)
## 8 . log(X8)
## 9 . log(X9)
##
##
## Variables Entered/Removed:
##
## - log(X4) added
## - log(X8) added
## - log(X9) added
## - log(X6) added
## - log(X1) added
## - log(X3) added
##
## No more variables to be added or removed.
k
##
##
## Stepwise Summary
## -------------------------------------------------------------------------
## Variable Method AIC RSS Sum Sq R-Sq Adj. R-Sq
## -------------------------------------------------------------------------
## log(X4) addition 44.925 6.429 64.964 0.90995 0.90673
## log(X8) addition 44.798 5.989 65.404 0.91611 0.90990
## log(X9) addition 14.099 2.014 69.379 0.97179 0.96854
## log(X6) addition 4.579 1.372 70.021 0.98079 0.97771
## log(X1) addition 2.166 1.184 70.209 0.98342 0.97996
## log(X3) addition 0.009 1.031 70.362 0.98556 0.98180
## -------------------------------------------------------------------------
# plot(k)
# Stepwise Regression based on p values for full log interaction model #
# k <- ols_step_both_p(model_wf_full_log_inter)
# k
# plot(k)
# Stepwise AIC Regression for full log interaction model #
# k <- ols_step_both_aic(model_wf_full_log_inter)
# k
# plot(k)
# Stepwise Regression based on p values for all log interaction model #
# k <- ols_step_both_p(model_wf_all_log_inter)
# k
# plot(k)
# Stepwise AIC Regression for all log interaction model #
# k <- ols_step_both_aic(model_wf_all_log_inter)
# k
# plot(k)
# Stepwise Regression based on p values for mixed regression 2 model #
# k <- ols_step_both_p(model_wf_mix2)
# k
# plot(k)
# Stepwise AIC Regression for mixed regression 2 model #
# k <- ols_step_both_aic(model_wf_mix2)
# k
# plot(k)
# Stepwise Regression based on p values for X4 eliminated model #
# k <- ols_step_both_p(model_wf_rm4_log)
# k
# plot(k)
# Stepwise AIC Regression for X4 eliminated model #
# k <- ols_step_both_aic(model_wf_rm4_log)
# k
# plot(k)
# Stepwise Regression based on p values for X1 eliminated model #
# k <- ols_step_both_p(model_wf_rm1_log)
# k
# plot(k)
# Stepwise AIC Regression for X1 eliminated model #
# k <- ols_step_both_aic(model_wf_rm1_log)
# k
# plot(k)
# All Possible Regression for full log model #
# k <- ols_step_all_possible(model_wf_full_log)
# plot(k)
# head(arrange(k, desc(adjr)))
# All Possible Regression for all log model #
# k <- ols_step_all_possible(model_wf_all_log)
# plot(k)
# head(arrange(k, desc(adjr)))
# All Possible Regression for 3g log model #
#!!!!!!!!!!!! k <- ols_step_all_possible(model_wf_3g_log_inter)
# plot(k)
# head(arrange(k, desc(adjr)))
# All Possible Regression for mixed log model #
# k <- ols_step_all_possible(model_wf_mix2 )
# plot(k)
# head(arrange(k, desc(adjr)))
# All Possible Regression for X4 eliminated model #
# k <- ols_step_all_possible(model_wf_rm4_log)
# k
# plot(k)
# All Possible Regression for X1 eliminated model #
# k <- ols_step_all_possible(model_wf_rm1_log)
# k
# plot(k)
# Lack of Fit F Test
ols_pure_error_anova(lm(y~X1, data = table_wf))
ols_pure_error_anova(lm(y~X4, data = table_wf))
alias(lm(y ~ as.factor(X3) + as.factor(X4) + as.factor(X5) + as.factor(X6) + as.factor(X7), data=table_wf))
alias(lm(y ~ as.factor(X1) + as.factor(X8) , data=table_wf))
alias(lm(y ~ as.factor(X4) + as.factor(X9) , data=table_wf))
alias(lm(y ~ as.factor(X3) + as.factor(X6) + as.factor(X7) + as.factor(X8) + as.factor(X9) , data=table_wf))
ols_regress(model_wf_aic_all_log)
## Model Summary
## -------------------------------------------------------------
## R 0.993 RMSE 0.208
## R-Squared 0.986 Coef. Var 3.273
## Adj. R-Squared 0.982 MSE 0.043
## Pred R-Squared 0.975 MAE 0.136
## -------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## --------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## --------------------------------------------------------------------
## Regression 70.394 6 11.732 270.106 0.0000
## Residual 0.999 23 0.043
## Total 71.393 29
## --------------------------------------------------------------------
##
## Parameter Estimates
## ----------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ----------------------------------------------------------------------------------------
## (Intercept) 0.571 3.360 0.170 0.866 -6.379 7.522
## log(X1) 0.726 0.036 0.967 20.107 0.000 0.651 0.800
## log(X3) 0.419 0.096 0.139 4.359 0.000 0.220 0.617
## log(X5) 1.259 0.796 0.083 1.582 0.127 -0.387 2.905
## log(X6) -0.267 0.090 -0.087 -2.960 0.007 -0.454 -0.080
## log(X8) 1.623 0.175 0.510 9.267 0.000 1.260 1.985
## log(X9) -1.375 0.154 -0.503 -8.919 0.000 -1.694 -1.056
## ----------------------------------------------------------------------------------------
summary(model_wf_aic_all_log)
##
## Call:
## lm(formula = log(y) ~ log(X1) + log(X3) + log(X5) + log(X6) +
## log(X8) + log(X9), data = table_wf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.67722 -0.08003 0.01102 0.13879 0.25715
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.57120 3.36000 0.170 0.86650
## log(X1) 0.72550 0.03608 20.107 4.31e-16 ***
## log(X3) 0.41866 0.09605 4.359 0.00023 ***
## log(X5) 1.25873 0.79566 1.582 0.12731
## log(X6) -0.26702 0.09022 -2.960 0.00702 **
## log(X8) 1.62253 0.17508 9.267 3.15e-09 ***
## log(X9) -1.37489 0.15416 -8.919 6.33e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2084 on 23 degrees of freedom
## Multiple R-squared: 0.986, Adjusted R-squared: 0.9824
## F-statistic: 270.1 on 6 and 23 DF, p-value: < 2.2e-16
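Each row of the coefficient table can be checked from its estimate and standard error alone. The sketch below does this for the `log(X1)` row, using the 23 residual degrees of freedom reported above; it assumes the standard t-based inference for ordinary least squares and a 95% interval, which is what the `lower` and `upper` columns of `ols_regress` appear to report.

```r
# log(X1) row of summary(model_wf_aic_all_log)
est    <- 0.72550   # coefficient estimate
se     <- 0.03608   # standard error
df_res <- 23        # residual degrees of freedom

t_value <- est / se
p_value <- 2 * pt(abs(t_value), df_res, lower.tail = FALSE)  # two-sided
ci_95   <- est + c(-1, 1) * qt(0.975, df_res) * se           # 95% interval

print(round(t_value, 3))
print(signif(p_value, 3))
print(round(ci_95, 3))
```

Because the response and the predictor are both logged, the estimate reads as an elasticity: a 1% increase in X1 is associated with roughly a 0.73% increase in y, holding the other terms fixed.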
Anova(model_wf_aic_all_log)
| Term | Sum Sq | Df | F value | Pr(>F) |
| log(X1) | 17.6 | 1 | 404 | 4.31e-16 |
| log(X3) | 0.825 | 1 | 19 | 0.00023 |
| log(X5) | 0.109 | 1 | 2.5 | 0.127 |
| log(X6) | 0.38 | 1 | 8.76 | 0.00702 |
| log(X8) | 3.73 | 1 | 85.9 | 3.15e-09 |
| log(X9) | 3.45 | 1 | 79.5 | 6.33e-09 |
| Residuals | 0.999 | 23 | | |
# Collinearity Diagnostics #
ols_vif_tol(model_wf_aic_all_log)
| Variables | Tolerance | VIF |
| log(X1) | 0.263 | 3.8 |
| log(X3) | 0.603 | 1.66 |
| log(X5) | 0.22 | 4.55 |
| log(X6) | 0.71 | 1.41 |
| log(X8) | 0.201 | 4.99 |
| log(X9) | 0.191 | 5.22 |
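The two columns of the collinearity table carry the same information: tolerance is `1 - R_j^2` from regressing predictor j on the other predictors, and VIF is its reciprocal. A short sketch using the printed tolerances; the cut-off of 10 is a common rule of thumb and an assumption here, not something stated in the output.

```r
# Tolerances copied from the ols_vif_tol table above
tolerance <- c("log(X1)" = 0.263, "log(X3)" = 0.603, "log(X5)" = 0.220,
               "log(X6)" = 0.710, "log(X8)" = 0.201, "log(X9)" = 0.191)

vif  <- 1 / tolerance   # variance inflation factor
r2_j <- 1 - tolerance   # R-squared of each predictor on the others

print(round(cbind(tolerance, vif, r2_j), 3))

# Rule-of-thumb flag (assumed threshold): VIF above 10
print(names(vif)[vif > 10])
```

The recomputed VIFs differ slightly from the printed ones because the tolerances are rounded to three decimals; none of them exceeds the assumed threshold.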
# Model Fit Assessment
ols_plot_diagnostics(model_wf_aic_all_log)
# Residual Normality Correlation
ols_test_correlation(model_wf_aic_all_log) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9278999
# Residual Normality Tests
ols_test_normality(model_wf_aic_all_log) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.8746 0.0021
## Kolmogorov-Smirnov 0.0964 0.9180
## Cramer-von Mises 7.0221 0.0000
## Anderson-Darling 0.7277 0.0516
## -----------------------------------------------
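The four tests do not agree for this model, so it helps to apply the decision rule explicitly. The sketch below compares each printed p-value with a significance level of 0.05, which is an assumed choice rather than one fixed by the output.

```r
# p-values copied from the ols_test_normality output above
p_values <- c("Shapiro-Wilk"       = 0.0021,
              "Kolmogorov-Smirnov" = 0.9180,
              "Cramer-von Mises"   = 0.0000,
              "Anderson-Darling"   = 0.0516)

alpha <- 0.05                        # assumed significance level
reject_normality <- p_values < alpha

print(data.frame(p_value = p_values, reject = reject_normality))
```

Under this rule Shapiro-Wilk and Cramer-von Mises reject normality while Kolmogorov-Smirnov and Anderson-Darling do not, so the residual plots should be read alongside the tests rather than relying on a single p-value.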
# Variable Contributions
ols_plot_added_variable(model_wf_aic_all_log)
# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_aic_all_log)
summary(model_wf_mix1)
##
## Call:
## lm(formula = log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) +
## log(X8) + log(X9), data = table_wf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.59020 -0.09016 -0.00499 0.07966 0.35476
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.17550 5.79376 -0.030 0.9761
## X1 1.34608 0.26376 5.103 4.70e-05 ***
## X3 0.31833 0.06162 5.166 4.06e-05 ***
## X4 0.39212 0.04259 9.207 8.07e-09 ***
## log(X2 + X5 + X6 + X7) 0.62509 1.30490 0.479 0.6369
## log(X8) 1.53840 0.19147 8.035 7.68e-08 ***
## log(X9) -1.31591 0.16825 -7.821 1.18e-07 ***
## X1:X3 -0.39809 0.06721 -5.923 7.04e-06 ***
## X1:X4 0.03412 0.01452 2.350 0.0286 *
## X3:X4 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2307 on 21 degrees of freedom
## Multiple R-squared: 0.9844, Adjusted R-squared: 0.9784
## F-statistic: 165.1 on 8 and 21 DF, p-value: < 2.2e-16
Anova(model_wf_mix1)
| Term | Sum Sq | Df | F value | Pr(>F) |
| X1 | 0.434 | 1 | 8.16 | 0.00943 |
| X3 | 0.00231 | 1 | 0.0434 | 0.837 |
| X4 | 5.91 | 1 | 111 | 7.63e-10 |
| log(X2 + X5 + X6 + X7) | 0.0122 | 1 | 0.229 | 0.637 |
| log(X8) | 3.43 | 1 | 64.6 | 7.68e-08 |
| log(X9) | 3.25 | 1 | 61.2 | 1.18e-07 |
| X1:X3 | 0 | | | |
| X1:X4 | 0 | | | |
| X3:X4 | 0 | | | |
| Residuals | 1.12 | 21 | | |
# Collinearity Diagnostics #
ols_vif_tol(model_wf_mix1)
| Variables | Tolerance | VIF |
| X1 | 0 | Inf |
| X3 | 0 | Inf |
| X4 | 0 | Inf |
| log(X2 + X5 + X6 + X7) | 0.112 | 8.91 |
| log(X8) | 0.205 | 4.87 |
| log(X9) | 0.197 | 5.08 |
| X1:X3 | 0 | Inf |
| X1:X4 | 0 | Inf |
| X3:X4 | 0 | Inf |
# Model Fit Assessment
ols_plot_diagnostics(model_wf_mix1)
# Residual Normality Correlation
ols_test_correlation(model_wf_mix1) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9658267
# Residual Normality Tests
ols_test_normality(model_wf_mix1) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
## -----------------------------------------------
## Test Statistic pvalue
## -----------------------------------------------
## Shapiro-Wilk 0.9419 0.1026
## Kolmogorov-Smirnov 0.1319 0.6261
## Cramer-von Mises 6.9129 0.0000
## Anderson-Darling 0.6116 0.1016
## -----------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_mix1)
# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_mix1)
# summary(model_wf_3g_aic_log_inter)
# Anova(model_wf_3g_aic_log_inter)
# Collinearity Diagnostics #
# ols_vif_tol(model_wf_3g_aic_log_inter)
# Model Fit Assessment
# ols_plot_diagnostics(model_wf_3g_aic_log_inter)
# Residual Normality Correlation
# ols_test_correlation(model_wf_3g_aic_log_inter) # Correlation between observed residuals and expected residuals under normality.
# Residual Normality Tests
# ols_test_normality(model_wf_3g_aic_log_inter) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
# Variable Contributions
# ols_plot_added_variable(model_wf_3g_aic_log_inter)
# Residual Plus Component Plot
# ols_plot_comp_plus_resid(model_wf_3g_aic_log_inter)
summary(model_wf_aic_all_log_inter)
##
## Call:
## lm(formula = log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) +
## log(X5) + log(X6) + log(X7) + log(X8) + log(X9) + log(X1):log(X3) +
## log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) +
## log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) +
## log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X9),
## data = table_wf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.114032 -0.038669 -0.003953 0.026220 0.160039
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -16.0272 13.9469 -1.149 0.28823
## log(X1) 1.1577 0.6966 1.662 0.14045
## log(X2) -0.7328 0.3383 -2.166 0.06698 .
## log(X3) 1.6656 1.7630 0.945 0.37625
## log(X4) -2.3936 2.9633 -0.808 0.44581
## log(X5) 5.0279 3.4728 1.448 0.19094
## log(X6) -0.1583 0.3004 -0.527 0.61463
## log(X7) -0.4075 0.5183 -0.786 0.45748
## log(X8) 44.9189 24.1746 1.858 0.10550
## log(X9) -37.0656 19.1702 -1.934 0.09443 .
## log(X1):log(X3) 0.8234 0.7067 1.165 0.28218
## log(X1):log(X8) 1.2421 0.6407 1.939 0.09372 .
## log(X1):log(X9) -0.9995 0.6739 -1.483 0.18156
## log(X2):log(X8) 1.0890 0.3972 2.742 0.02885 *
## log(X2):log(X9) -0.7095 0.3415 -2.078 0.07633 .
## log(X3):log(X9) 0.4112 0.3562 1.154 0.28626
## log(X4):log(X8) -3.0288 0.8494 -3.566 0.00915 **
## log(X4):log(X9) 2.1744 0.9286 2.342 0.05172 .
## log(X5):log(X8) -7.9687 5.5620 -1.433 0.19504
## log(X5):log(X9) 6.7951 4.4498 1.527 0.17059
## log(X6):log(X8) 0.4027 0.2891 1.393 0.20622
## log(X6):log(X9) -0.5057 0.2651 -1.908 0.09807 .
## log(X7):log(X9) 0.4637 0.3759 1.234 0.25719
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1252 on 7 degrees of freedom
## Multiple R-squared: 0.9985, Adjusted R-squared: 0.9936
## F-statistic: 206.6 on 22 and 7 DF, p-value: 7.713e-08
Anova(model_wf_aic_all_log_inter)
| Term | Sum Sq | Df | F value | Pr(>F) |
| log(X1) | 0.281 | 1 | 17.9 | 0.00386 |
| log(X2) | 0.00972 | 1 | 0.619 | 0.457 |
| log(X3) | 0.0102 | 1 | 0.648 | 0.447 |
| log(X4) | 0.000172 | 1 | 0.0109 | 0.92 |
| log(X5) | 0.0311 | 1 | 1.98 | 0.202 |
| log(X6) | 0.068 | 1 | 4.33 | 0.0759 |
| log(X7) | 0.00234 | 1 | 0.149 | 0.711 |
| log(X8) | 1.7 | 1 | 108 | 1.65e-05 |
| log(X9) | 1.73 | 1 | 111 | 1.53e-05 |
| log(X1):log(X3) | 0.0213 | 1 | 1.36 | 0.282 |
| log(X1):log(X8) | 0.059 | 1 | 3.76 | 0.0937 |
| log(X1):log(X9) | 0.0345 | 1 | 2.2 | 0.182 |
| log(X2):log(X8) | 0.118 | 1 | 7.52 | 0.0288 |
| log(X2):log(X9) | 0.0677 | 1 | 4.32 | 0.0763 |
| log(X3):log(X9) | 0.0209 | 1 | 1.33 | 0.286 |
| log(X4):log(X8) | 0.199 | 1 | 12.7 | 0.00915 |
| log(X4):log(X9) | 0.086 | 1 | 5.48 | 0.0517 |
| log(X5):log(X8) | 0.0322 | 1 | 2.05 | 0.195 |
| log(X5):log(X9) | 0.0366 | 1 | 2.33 | 0.171 |
| log(X6):log(X8) | 0.0304 | 1 | 1.94 | 0.206 |
| log(X6):log(X9) | 0.0571 | 1 | 3.64 | 0.0981 |
| log(X7):log(X9) | 0.0239 | 1 | 1.52 | 0.257 |
| Residuals | 0.11 | 7 | | |
# Collinearity Diagnostics #
# ols_vif_tol(model_wf_aic_all_log_inter)
# Model Fit Assessment
# ols_plot_diagnostics(model_wf_aic_all_log_inter)
# Residual Normality Correlation
# ols_test_correlation(model_wf_aic_all_log_inter) # Correlation between observed residuals and expected residuals under normality.
# Residual Normality Tests
# ols_test_normality(model_wf_aic_all_log_inter) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
# Variable Contributions
# ols_plot_added_variable(model_wf_aic_all_log_inter)
# Residual Plus Component Plot
# ols_plot_comp_plus_resid(model_wf_aic_all_log_inter)